INTRODUCTION


To meet the new and more rigorous college- and career-ready standards for student learning, all 
of today’s students must have access to effective teaching — every day and in every classroom. As 
teachers and their school leaders are increasingly held accountable for implementing consistently 
effective teaching, calls for holding the programs that prepare them accountable have increased 
in kind. State and federal policymakers are therefore seeking to change how teacher preparation 
programs are evaluated — for the purposes of accountability and support. 

This brief explores research that points to the opportunities and the challenges that evaluating 
teacher preparation programs differently presents. To begin laying the groundwork for the complex 
work ahead, we provide information on the following policy-relevant questions: 

■ What is the current status of teacher preparation program accountability and support? 

■ How is teacher preparation program evaluation changing? 

■ What are some ways to evaluate programs using the evidence of the quality of program 
processes? 

■ What are some ways to evaluate programs using the evidence of impact on outcomes? 

■ What are the strengths, weaknesses, opportunities, and challenges of each approach? 

■ How are states on the forefront of change — Louisiana, Texas, Tennessee, North Carolina, 
Ohio, and Florida — approaching the evaluation of their teacher preparation programs? 

We offer answers to these questions but do not suggest that one particular approach is necessarily 
superior to another; we provide a resource for state education agency personnel and other 
state-level stakeholders to use as they redesign systems of teacher preparation program 
accountability and support. 
RETHINKING TEACHER PREPARATION ACCOUNTABILITY AND SUPPORT POLICIES:
MOVING TOWARD A MORE RESULTS-ORIENTED APPROACH

What Is the Current Status of Teacher Preparation Program 
Accountability and Support? 

As policies increasingly hold teachers accountable for their
performance, calls to hold teacher preparation programs
accountable for their performance have also
increased. Currently, states use several mechanisms to hold
teacher preparation programs accountable for the quality of
the teachers they produce.

Most states have three levers for regulating program quality: 

1. Approval. State departments of education set program 
approval requirements and standards, typically requiring 
that teacher preparation programs apply for initial approval 
and then submit to periodic reviews conducted by panels 
of educators from across the state. 

2. Accreditation. Many states encourage or require teacher preparation programs to seek 
accreditation from a nongovernmental accrediting agency — such as the National Council 
for the Accreditation of Teacher Education (NCATE) 1 or the Teacher Education Accreditation 
Council (TEAC) — that reviews each program against the agency’s national standards. 

3. Certification. All states require that the graduates of teacher preparation programs meet 
minimum standards for certification, such as passing state tests of basic skills, holding a 
degree in a specific subject area, and completing coursework in particular domains. Such 
certification requirements act as a mechanism for program accountability insofar as the 
programs must ensure that their candidates meet these standards so that the public will 
view the programs as viable. 

However, observers have pointed out that these mechanisms exert variable control on teacher 
quality. For example, although teacher preparation programs must earn state approval to recommend 
teachers for state licensure, these approval processes vary widely, are rarely evidence based, and 
are monitored infrequently through compliance-oriented expectations (National Research Council, 
2010; Wilson & Youngs, 2005). Moreover, standards and processes for both approval and 
accreditation can be inefficient and may have requirements with little empirical justification (Allen, 
2003; Crowe, 2010; National Research Council, 2010). Many argue that certification requirements,
which tend to rely heavily on teacher testing, are a “crude proxy for teacher quality” (Walsh,
2002, p. 84) and are based on poor indicators of quality (Crowe, 2010). Finally, research on
the effects of these processes on teacher quality and teacher effectiveness is simply
inconclusive (Allen, 2003; National Research Council, 2010; Wilson & Youngs, 2005).

“Too many future teachers graduate from prep programs unprepared for success in the
classroom. We have to give teachers the support they need to ensure that children get the high
quality education they deserve. Our goal is to develop a system that recognizes and rewards good
programs, and encourages all of them to improve.”

(U.S. Department of Education, 2011a)

1 Beginning in 2013, the accrediting functions of NCATE and TEAC will merge and transfer to a new organization, the Council
for the Accreditation of Educator Preparation (CAEP). CAEP standards are being developed throughout 2012.

Considerable circumstantial evidence suggests that these accountability mechanisms do not
ensure that each state’s new teachers are ready for the classroom. Too many beginning
teachers report that they do not feel well prepared when they enter the classroom, and
their supervisors often agree (Levine, 2006). For example, data from the 2007-08 Schools
and Staffing Survey of the National Center for Education Statistics indicate that only
20 percent of teachers in their first year of teaching felt very prepared to select and
adapt curriculum materials, handle a wide range of classroom management and discipline
situations, and assess students (see Figure 1; National Comprehensive Center for Teacher
Quality, 2010). In addition, the observation that student achievement in the United States
lags behind that of other countries (National Center for Education Statistics, 2011) has
sparked debates about the effectiveness of teacher preparation in the United States
(Boyd, Grossman, Lankford, Loeb, & Wyckoff, 2009b; Wiseman, 2012).

Since the reauthorization of the Higher 
Education Act (HEA) in 1998, federal 
policymakers have sought to implement 
data collection that would yield systematic 
information on the characteristics and the 
outcomes of teacher preparation programs. 

The annual reporting requirements mandated 
in HEA Title II represent the first step in 
systematizing data collection, using common 
definitions, and making information public. 
Title II requires that states provide the secretary of education with multiple input, process, and
candidate outcome data points. These data points include the pass rates on assessments
used by states in certifying or licensing teachers, requirements for teaching certificates and
licensure, state efforts in the past year to improve teaching, descriptions of alternate routes to
licensure, and information on each teacher preparation program in the state (e.g., admissions
requirements, enrollment, and supervised clinical experience information; State Report 2010 —
Alabama, 2012). In all, states must report 440 data elements each year (Duncan, 2011).

Figure 1. Results of the 2007-08 Schools and Staffing Survey

Survey questions shown in the figure (Location: National; Gender: Both):

■ In their first year of teaching, how well prepared were new teachers to handle a range of
classroom management or discipline situations?

■ In their first year of teaching, how well prepared were new teachers to select and adapt
curriculum and instructional materials?

■ In their first year of teaching, how well prepared were new teachers to assess students?

Source: 2007-08 Schools and Staffing Survey from the National Center for Education Statistics;
analysis by the National Comprehensive Center for Teacher Quality.

In addition, HEA requires that the states implement procedures for identifying and assisting
low-performing teacher preparation programs. As of 2007, 31 states had never identified a program as
either at risk or low performing (Carey, 2007). In 2010, 11 states identified 38 low-performing
teacher preparation programs (U.S. Department of Education, 2011c). Thus, despite the increased 
publication of data and recent accountability efforts, policy leaders question the utility of the Title II 
reporting requirements. 


How Is Teacher Preparation Program Evaluation Changing? 

Renewed dialogue around teacher effectiveness and teacher preparation has spurred new initiatives, 
efforts, and calls for reform among nonprofit organizations and accrediting agencies. Some of the 
recent efforts include the following: 

■ In 2010, NCATE released its Blue Ribbon Panel report, which called for increased selectivity, 
accountability, and clinically based preparation in teacher preparation programs (Wiseman, 2012). 

■ The National Academy of Education, which is funded by the National Science Foundation, 
convened a panel of teacher educators to “synthesize research and experiential knowledge 
about existing approaches to evaluating teacher preparation” and “create a design framework
for the development of new and innovative approaches,” paying particular attention to the 
stakeholders that will use the information for improvement, accountability, and equity (National 
Academy of Education, n.d.). 

■ In 2012, CAEP (formerly NCATE and TEAC) announced that it would form the Commission on 
Standards and Reporting to develop new accreditation standards for teacher preparation that 
will use multiple measures and focus on outcome data and key program characteristics (CAEP,
2012). Furthermore, CAEP anticipates that the influx of new measures and more data on
teacher preparation will enable teacher preparation programs and accrediting agencies to 
improve these programs and make better-informed judgments related to program quality 
(Cibulka, 2012). 

■ In 2013, the National Council on Teacher Quality (NCTQ), in partnership with U.S. News & 
World Report, will release its national review and rankings of approximately 1,000 teacher 
preparation programs across the country (NCTQ, 2011a). 2 

2 These rankings are based on 18 standards and indicators of program selectivity, the content of preparation courses in
terms of what teachers should know and be able to do, whether a program collects data related to the outcomes of its
graduates, and whether the graduates meet state thresholds in terms of impact on student learning (NCTQ, 2011b). The
evidence used in scoring the programs includes admissions standards, course syllabi, textbooks, student teaching policy
handbooks, and programmatic outcome data (NCTQ, 2011b).

The Obama administration has also called for revised policies. Its 2011 reform plan, Our Future,
Our Teachers: The Obama Administration’s Plan for Teacher Education Reform and Improvement,
noted that reporting and accountability requirements “have not led to meaningful change” and
questioned whether the HEA data points are based on “meaningful indicators of program
effectiveness” (U.S. Department of Education, 2011b, p. 9). Furthermore, U.S. Secretary of
Education Arne Duncan challenged the need for these requirements, arguing that gathering data
on input measures wastes the time and the limited resources of teacher preparation programs. 
Instead, he suggested a shift in focus from program inputs to program outputs (Duncan, 2011). 
Ultimately, the Obama administration has offered several alternatives for a more streamlined 
reporting system focused on outcome measures, including the following: 

■ Aggregate the learning outcomes of K-12 students taught by the graduates of teacher 
preparation programs, using “multiple, valid measures of student achievement to reliably 
ascertain growth associated with graduates of preparation programs” (U.S. Department of 
Education, 2011b, p. 10). 

■ Identify the job placement and retention rates of the graduates of teacher preparation 
programs, with particular attention to shortage areas. 

■ Collect the perceptions of performance and effectiveness via surveys of the graduates of 
teacher preparation programs and their principals. 

Various stakeholders discussed these measures in Department of Education (ED) negotiated 
rule-making sessions in early 2012. 3 After several months of discussions, the negotiators appeared 
deadlocked; according to Hunter College Dean David Steiner, “long-standing divisions have reemerged” 
regarding which measures to use for teacher preparation program accountability (Sawchuk, 2012b). 
Because ED declined to extend the rule-making process beyond April 2012, the department will 
now craft its own rules (Sawchuk, 2012b). 

Separate from HEA reporting requirements, the $4 billion Race to the Top competitive grant 
program required the winning Round 1 and Round 2 states to adopt more rigorous accountability 
mechanisms for teacher preparation. The winners committed to (1) linking data on the achievement
and the growth of teachers’ students back to the programs that prepared those teachers,
(2) publicly reporting this information for each teacher preparation program in the state, and
(3) working to expand programs that are successful at producing graduates who are effective
teachers (Wiseman, 2012). Crowe’s (2011) analysis of the Round 1 and Round 2 Race to the Top
grant recipients found that they all plan to publicly disclose the student achievement data of the
graduates of teacher preparation programs, and five winners will use that information for program
accountability. Other measures of teacher preparation programs that the Race to the Top winners
plan to use and disclose include graduates’ rates of persistence in teaching, job placement
data, and attainment of advanced licensure. The plans of the
Race to the Top winners illustrate the renewed interest in revising how the states evaluate teacher
preparation programs. (We profile the efforts of four Race to the Top states later in this brief.)

So where does this brief fit in? The success and the usefulness of accountability efforts are
dependent on the quality of the measures used and how states, teacher preparation programs,
and individuals use the data gathered from these measures. Therefore, as the states and ED
revise preexisting accountability systems for teacher preparation programs, careful consideration
of the available measures is needed. This brief explores the research, underscores potential
measures and their opportunities and challenges, and gives recommendations for moving ahead.
(The appendix summarizes some of the strengths and weaknesses of each approach.) We also
present examples of how six states are developing new metrics for teacher preparation programs,
combining them, and using them.

3 See Steve Sawchuk’s reporting on these sessions in his Teacher Beat blog:

• http://blogs.edweek.org/edweek/teacherbeat/2012/01/the_us_department_of_this.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/01/negotiators_tackle_teacher_ed.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/01/day_2_of_teacher-ed_rulemaking.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/02/draft_regulations_would_unite.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/02/negotiators_weighjnputs_vs_ou.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/04/a_last-minute_repreive_on_new.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/04/negotiators_seemed_no_more_rea.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/04/deadlocked_negotiators_fail_to.html
• http://blogs.edweek.org/edweek/teacherbeat/2012/04/teacher-prep_rulemaking_is_con.html


What Are Some Ways to Evaluate Programs Using 
the Evidence of the Quality of Program Processes? 

Shifting from measuring program inputs (such as faculty qualifications, faculty-student ratios,
competitiveness rankings, enrollment data, or general requirements) to measuring more
meaningful processes moves the field closer to measures that provide useful
information for stakeholders seeking program improvement and accountability. Although
the research is still not definitive, a growing consensus suggests that three aspects of program
processes are important for program effectiveness: (1) program selection, (2) program content
(i.e., what is taught in the teacher preparation program), and (3) program structure (i.e., the
extent to which candidates have access to high-quality clinical experiences throughout their
preservice experience; NCATE, 2010). Researchers are currently working to
disentangle the measurement of these processes to determine which are most important, and
in what configuration, for program effectiveness (see Boyd et al., 2009b; National Center for
Analysis of Longitudinal Data in Education Research, 2012). Nevertheless, states and other
stakeholders seeking to evaluate teacher preparation programs on the basis of these processes 
have several options. It is important to note that no single measure of program processes is 
sufficient to judge the quality or the effectiveness of teacher preparation, and all the measurement 
options described in the following sections have both strengths and limitations. 

Candidate Selection Processes for Teacher Preparation Programs 

An important aspect of teacher preparation program quality is how the programs recruit and select 
teacher candidates. The evidence suggests that small but significant correlations exist between 
various measures of individuals’ aptitude before entering a program and their eventual teaching 
effectiveness (Henry, Bastian, & Smith, 2012; Levine, 2006; Rice, 2003). For example, a recent 
study in North Carolina assessed the effects of candidate aptitude on several outcomes. Teachers 
who received merit-based scholarships and graduated from North Carolina public institutions of 
higher education (IHEs; mean SAT approximately 1,167) produced higher student achievement 
in the elementary, middle, and high school grades and persisted at higher rates in the teaching 
profession compared to other North Carolina public IHE graduates who did not receive scholarships 
(mean SAT approximately 1,025; Henry et al., 2012). 

Research on Teach for America (TFA) provides a compelling example of the link between selectivity
and outcomes. TFA teachers appear to have the highest aptitude scores (SAT and/or ACT) of any
sizable group entering the teaching profession (Kopp & Farr, 2011). In a 2008 study by Kane and
his colleagues, the SAT scores of TFA teachers exceeded those of traditionally trained teachers
by approximately 0.4 and 0.5 standard deviation in mathematics and reading, respectively (Kane,
Rockoff, & Staiger, 2008). TFA teachers generally produce student achievement equal to or
higher than teachers who did not participate in TFA, particularly in mathematics (Decker, Mayer,
& Glazerman, 2004; Henry et al., 2012; Kane et al., 2008). However, it is not clear whether these
effects are due entirely to more rigorous selection or to the unique training that TFA teachers undergo
after being selected. Although some limited evidence suggests that more selective programs may produce
more effective teachers, no research examines the utility of tying candidate selection
processes to accountability measures for teacher preparation programs.

Nevertheless, using candidate selection processes as a measure of the quality of teacher 
preparation programs presents some strengths. School and district leaders responding to parent 
demands for academically talented teachers may wish to know which programs are most selective 
and focus their recruitment efforts at those institutions. Meanwhile, teacher preparation programs 
may want to know what selection criteria and processes produce the most effective graduates so 
they can continuously improve these important aspects of their programs. 

Measures of the cognitive competence of candidates exist at virtually all IHEs with teacher 
preparation programs. Easily available measures include high school grade point average (GPA) 
and class rank, SAT or ACT domain and composite scores, placement tests given by IHEs 
to determine a candidate’s readiness for coursework, and college grades in education and 
noneducation classes. College grades in a major area that matches the teaching preparation 
specialty (e.g., mathematics) may be used as well. However, there may be less easily observable 
or measurable aspects of candidate aptitude that have a greater effect on eventual teaching 
effectiveness than measures of cognitive competence, such as a strong internal locus of control, 
sensitivity, and the ability to persist in the face of difficulty (e.g., Farr, 2010; Rimm-Kaufman et al.,
2002). More research on best practices in candidate selection is clearly warranted. 

Teacher Preparation Program Content 

Studying the course content in teacher preparation programs generally relies on analyses of course 
syllabi, which focus on the content covered and the course requirements. Syllabi reviews are an 
integral part of the NCATE review process and continue to be used by other organizations to make 
inferences about the content of teacher preparation programs (see Greenberg & Walsh, 2012; 
Walsh, Glaser, & Wilcox, 2006). As NCTQ notes, course syllabi, in a sense, are “the basic units 
of the design of teacher preparation programs” because they highlight the key content that will 
be covered in the courses (NCTQ, 2012). 

Syllabi, when reviewed systematically and coded consistently, present reviewers and IHEs
with opportunities to learn. They can provide greater insight into instruction than the number of
course hours or a listing of courses. Syllabi can help identify the quality and the content of courses
across both IHEs and course sections and give future employers insight into what their new
hires should know and what additional content-focused professional development they will need.
Furthermore, teacher preparation programs can use the results of syllabi reviews to revamp
courses and improve instruction. (For examples of syllabi review tools, see The National
Comprehensive Center for Teacher Quality’s Innovation Configurations.)

Nonetheless, the usefulness of syllabi is limited. Syllabi are implied contracts between the IHE, the college,
the department, the program, the instructor, and students and are carefully prepared by most IHE
faculty (Parkes & Harris, 2002). However, such documents may not fully capture what is actually
taught in each course. Some content in a course syllabus may not be taught, and other content not
listed in the syllabus may, in fact, be taught. Although imperfect and insufficient as a single measure
of the quality of a teacher preparation program, course syllabi are one of the best currently available
measures of the content and course requirements in teacher preparation coursework.


For example, as part of its teacher preparation program approval and reauthorization processes, the 
Colorado Department of Education (2010) conducts content reviews of teacher preparation programs 
by examining the syllabi of all required program courses. This review seeks to ensure that the 
program aligns with the eight teacher performance-based standards and additional endorsement 
standards in Colorado. Reviewers use a rubric to determine the extent to which the syllabi meet 
state requirements and also use an alignment tool to determine the extent to which the syllabi align 
with the standards for licensing teacher education candidates. This process uses syllabi as a proxy 
for the knowledge imparted and the skills developed in teacher candidates through instruction. 


The Office of Special Education Programs (OSEP) also uses course syllabi to evaluate teacher 
preparation programs. Those programs applying for Individuals with Disabilities Education Act 
discretionary grants must include in their application appendixes copies of course syllabi for 
all coursework in the major and for all research methods, evaluation methods, or data analysis 
courses that either the program requires or students have elected to take in the last 5 years (U.S. 
Department of Education, 2012b). OSEP reviews the syllabi to see if the course content meets the 
requirements specified in the grant. 4 



The National Comprehensive Center for Teacher Quality’s Innovation Configurations 


To promote the implementation of evidence-based instructional practices in teacher preparation 
activities, the National Comprehensive Center for Teacher Quality (TQ Center) offers seven 
rubrics using coding systems that IHEs, states, and other interested stakeholders can use to 
monitor and evaluate the quality of course syllabi. (An excerpt is given in Figure 2.) Each 
innovation configuration is based on research on best practices and accompanies a TQ 
Connection Issue Paper or Research & Policy Brief. The users of these tools can assess the 
level of implementation on five levels: not mentioned; mentioned; mentioned with lecture and/or 
reading assigned; a project, test, or paper assigned; and implementation with supervision and 
feedback. Innovation configurations cover the following topics: 

• Scientifically based reading instruction 

• Classroom organization and behavior management 

• Inclusive services 

• Learning strategy instruction 

• Response to intervention 

• Linking assessment and instruction 

• Evidence-based mathematics instruction 


For more information, see http://www.tqsource.org/publications/innovationconfigurations.php. 


4 For example, applicants for Type B programs must demonstrate that courses include or will incorporate research and 
evaluation findings on using outcome and achievement data in evaluating the effectiveness of early intervention providers; 
discuss methodological and statistical considerations in conducting an evaluation of the effectiveness of early learning 
personnel; and engage students in reviewing, critiquing, or participating in evaluations of the effectiveness of early 
intervention providers or personnel (U.S. Department of Education, 2012b). 



Research & Policy Brief 




Figure 2. An Excerpt of the Scientifically Based Reading Instruction Innovation Configuration 
From the TQ Center 


Instructions: Place an X under the appropriate variation implementation score for each course
syllabus that meets the criteria specified, from 0 to 4. Score and rate each item separately.
Descriptors and examples are bulleted below each of the components.

Variations

Code = 0: There is no evidence that the component is included in the class syllabus.

Code = 1: Syllabus mentions content related to the component.

Code = 2: Syllabus mentions the component and requires readings and tests or quizzes.

Code = 3: Syllabus mentions the component and requires readings, tests or quizzes, and
assignments or projects for application.

• Observations

• Lesson plans

• Classroom demonstration

• Journal response

Code = 4: Syllabus mentions the component and requires readings, tests or quizzes,
assignments or projects, and teaching with application and feedback.

• Fieldwork (practicum)

• Tutoring

Rating: Rate each item as the number of the highest variation receiving an X under it.

Essential Components

Phonemic Awareness

(This topic is ideally subsumed under the broader topic Phonological Awareness.)

• Individual speech sounds, phonemes

• Early indicator of risk

• Precursor to phonics

• Detect, segment, blend, manipulate phonemes (sounds) (e.g., /b/ /a/ /t/ = bat)

• Rhyming, alliteration in preschool and kindergarten

• Elkonin boxes (common activity)

Phonics

• Correspondence of sounds and letters

• Phoneme-grapheme correspondences

• Blending, decoding, encoding

• Syllable types

• Prefixes, suffixes, base words

• Nonsense words (assessment)

• Alphabetic Principle

• Word analysis

• Words composed of letters (graphemes) that map to phonemes

• Letters and sounds working in systematic way


Source: Smartt & Reschly (2011, p. 3). 
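
The rating rule in Figure 2 (each component is rated as the highest code that received an X) can be illustrated with a minimal sketch. The function names, the dictionary layout, and the sample marks below are hypothetical and are not part of the TQ Center tool. Treating a component with no marks as an error is also an assumption; a reviewer might instead record Code = 0.

```python
from typing import Dict, List

# Codes 0 through 4 correspond to the variations described in Figure 2.
VALID_CODES = range(0, 5)


def rate_component(marked_codes: List[int]) -> int:
    """Return the rating for one component: the highest code that received an X."""
    if not marked_codes:
        # Assumption: an unmarked component is treated as a scoring error.
        raise ValueError("at least one code must be marked for each component")
    for code in marked_codes:
        if code not in VALID_CODES:
            raise ValueError(f"code {code} is outside the 0-4 scale")
    return max(marked_codes)


def rate_syllabus(marks: Dict[str, List[int]]) -> Dict[str, int]:
    """Rate every component of one syllabus separately."""
    return {component: rate_component(codes) for component, codes in marks.items()}


# Hypothetical marks for a single course syllabus.
example_marks = {
    "Phonemic Awareness": [1, 2],
    "Phonics": [1, 2, 3, 4],
}
ratings = rate_syllabus(example_marks)
print(ratings)
```

Because each component is rated separately, a syllabus can score high on one component and low on another, and the sketch deliberately does not combine the ratings into a single program score.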


Clinical and Student-Teaching Experiences 

Measures of the quality of clinical and student-teaching experiences may also be used to assess 
teacher preparation programs. Research provides some limited evidence that clinical and student- 
teaching experiences provide teacher candidates with opportunities to learn about teaching and 
help reduce anxiety among those entering the profession (Rice, 2003). In addition, Boyd et al. (2009b) 
found that teacher preparation programs that required more oversight of student-teaching experiences 
or required students to complete a capstone report produced first-year teachers who were significantly 
more effective at increasing student achievement (p. 434). 

By including some indicators related to clinical and student-teaching experiences, evaluators 
recognize that a knowledge base of effective practices exists and can be transmitted to novice 
teachers to improve student results. Most of the states (41 of 50) have set requirements regarding 
the length of student teaching, and 15 states require other clinical experiences (EPE Research 
Center, 2012). These data are easy to collect, but they do not provide detailed information about 
the quality of field experiences. 
Surveys and document reviews are two possible ways to assess the quality of clinical and
student-teaching experiences. Both require systematic analysis. In addition,
a paucity of research in this area provides little information about which features of clinical and
student-teaching experiences are most important. Nonetheless, both measures provide greater
insight into the quality of teacher preparation programs than the number of hours devoted to
student teaching alone.

Surveys can be low-cost measures that gather information directly from those most impacted by 
student-teaching experiences — teacher candidates. In fact, many teacher preparation programs 
already survey their graduates. However, surveys are subject to bias and rely heavily on perception 
rather than actuality. Document reviews draw on preexisting documents and may provide valuable
insight into the structure, the format, the requirements, and the expectations of student
teaching in a given teacher preparation program. Like syllabi reviews, however, document reviews 
may uncover intentions rather than practices. 

The Texas State Board of Educator Certification uses surveys to determine whether the minimum 
standards for student teaching have been met and if the experiences are of high quality. Texas 
requires that teacher candidates be observed for 45 minutes at least 3 times within their 12-week 
student-teaching or clinical experiences (19 TAC §228. 35(f)) and sets compliance percentages 
annually. To assess whether these percentages are met and whether the teacher preparation 
program provided high-quality field supervision, the Texas Education Agency uses a subset of 
survey questions of the graduates of teacher preparation programs to gather information about 
their field experiences (see Sample Questions From the Candidate Exit Survey). 
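
As a rough illustration of how yes/no survey items about field observations could be turned into a program-level compliance percentage, the following Python sketch tallies the share of respondents who report both at least three formal observations and at least 45 minutes per observation. The field names, the sample records, and the rule that both items must be answered yes are assumptions made for illustration; they are not the Texas Education Agency's actual scoring procedure.

```python
from collections import defaultdict


def compliance_by_program(responses):
    """Return {program: percent of respondents reporting compliant observations}.

    Each response is a dict with hypothetical keys:
      "program"              - identifier of the preparation program
      "observed_three_times" - True if the candidate reports >= 3 formal observations
      "observed_45_minutes"  - True if the candidate reports >= 45 minutes per observation
    """
    totals = defaultdict(int)
    compliant = defaultdict(int)
    for r in responses:
        totals[r["program"]] += 1
        # Assumed rule: a response counts as compliant only if both items are "yes".
        if r["observed_three_times"] and r["observed_45_minutes"]:
            compliant[r["program"]] += 1
    return {p: 100.0 * compliant[p] / totals[p] for p in totals}


# Made-up responses for two hypothetical programs.
sample = [
    {"program": "A", "observed_three_times": True, "observed_45_minutes": True},
    {"program": "A", "observed_three_times": True, "observed_45_minutes": False},
    {"program": "B", "observed_three_times": True, "observed_45_minutes": True},
]
rates = compliance_by_program(sample)
```

A state could then compare each program's percentage with whatever compliance threshold it sets for the year. Because the inputs are self-reports, the result carries the same perception bias noted above for surveys in general.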

An NCTQ-published study by Greenberg, Pomerance, and Walsh (2011) used surveys and 
document analyses to assess the quality of clinical and student-teaching experiences. NCTQ 
collected documents from teacher preparation programs that provided information on the 
selection and the responsibilities of cooperating teachers and field supervisors and on the 
expectations of and the guidance provided to student teachers. These documents included 
handbooks, manuals, and other relevant documents. In addition, NCTQ secured contracts 
between the teacher preparation programs and the school districts. To triangulate findings from 
the documents and follow-up discussions with the teacher preparation programs, NCTQ surveyed 
local school principals online or by telephone to gather additional data on student teaching. 
Based on the data gathered, NCTQ determined the extent to which each program met five 
standards and then rated the programs based on those standards. 5


5 The standards were as follows: 

1. “The student-teaching experience, which should last no less than 10 weeks, should require no less than five weeks 
at a single local school site and represent a full-time commitment. 

2. The teacher preparation program must select the cooperating teacher for each student teacher placement. 

3. The cooperating teacher candidate must have at least three years of teaching experiences. 

4. The cooperating teacher candidate must have the capacity to have a positive impact on student learning. 

5. The cooperating teacher candidate must have the capacity to mentor an adult, with skills in observation, providing 
feedback, holding professional conversations and working cooperatively” (Greenberg et al., 2011, p. 3).
Sample Questions From the Candidate Exit Survey, 2010-2011


Unless otherwise noted, the response options are always/almost always, frequently, occasionally,
and rarely.

1. To what extent did the field supervisor share with you the expectations for your performance
in the classroom before each observation? 

2. To what extent did the field supervisor base observation feedback on the expectations for your 
performance in the classroom? 

3. To what extent did the field supervisor provide you with a written report or checklist of his or 
her observation of your performance in the classroom? 

4. Did you ever communicate with your field supervisor by e-mail, text, or telephone call? (Yes/No) 

5. To what extent did your field supervisor respond to your communications (e.g., e-mail, text, 
or telephone call) within two school or business days? 

6. To what extent did your field supervisor offer you opportunities to reflect on your performance 
in the classroom? 

7. To what extent did your field supervisor provide multiple means for you to communicate with 
him or her, such as e-mail, telephone, texting, videoconferencing, or face-to-face interaction? 

8. To what extent did your field supervisor ask you for ways he or she can support you? 

9. The field supervisor formally observed me teaching a minimum of three times. (Yes/No) 

10. The field supervisor observed me teaching for a minimum of 45 minutes during at least three 
of my formal observations. (Yes/No) 

Adapted from the Candidate Exit Survey compiled by the Texas Education Agency and the
Texas Comprehensive Center at SEDL, http://www.tea.state.tx.us/WorkArea/linkit.aspx?LinkIdentifier=id&ItemID=2147505755&libID=2147505749.


Process Measures Overview 

Process measures get at the substance of teacher preparation — candidate selection, course 
content and requirements, and the experiences and the supports provided to teacher candidates. 
However, a dearth of research and development on the core practices and the skills that teachers 
need to be effective limits our understanding of how teachers are best prepared for the classroom.
Process measures should therefore be used with caution. Even so, each measure presents
opportunities and challenges for providing useful information on teacher preparation program 
quality. Table 1 summarizes the opportunities and the challenges of assessing programs based 
on process measures. The appendix also summarizes the strengths and the weaknesses of 
each measure. 
Table 1. The Opportunities and the Challenges of Using Process Measures to Evaluate Teacher
Preparation Programs


Opportunities

For accountability

• Provides more information to policymakers and program faculty on the research-based content and
structures of teacher preparation programs.

For capacity building

• Helps faculty understand and implement research-based preparation processes.

• Can identify possible gaps in the content of teacher preparation program coursework and clinical work.

For equity

• Can help determine which preparation programs serving high-need schools have research-based
processes in place.


Challenges

• The research base on effective practices in teacher education is not currently robust enough to build a
high-stakes assessment for accountability based on measuring processes.

• There may be multiple and varied pathways to effectiveness.

• Process measures may discourage innovation.

• Process measures may require complex qualitative measures that are difficult to score reliably across IHEs.


What Are Some Ways to Evaluate Programs Using the Evidence 
of Impact on Outcomes? 

Process measures provide potentially useful data on what occurs during teacher preparation but say 
little about what happens after candidates complete a program. Do program graduates demonstrate 
effective teaching practices? Are the graduates successful in producing high levels of student 
achievement? Do the graduates remain in the classroom? The answers to these questions may 
be more important for school and district leaders to know than how these results were obtained. 
Outcome measures provide insight into these questions. 

Student Achievement and the Growth of the Students of the Graduates of Teacher 
Preparation Programs 

At least 14 states are seeking to use value-added modeling (VAM) — a statistical method of measuring 
a teacher’s contribution to growth in student achievement — or other estimates of student achievement 
growth to compare teacher preparation programs (Sawchuk, 2012a). Although the exact methods and 
plans vary, the states increasingly plan to use the student achievement gains of the students of 
beginning teachers and aggregate the gains based on which in-state teacher preparation program 
recommended those teachers for certification. 6 The aggregated results can then be used to compare 
multiple teacher preparation programs. 
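
The aggregation step described above can be sketched in a few lines of Python. The sketch assumes that teacher-level value-added estimates have already been produced by some separate model; it only groups those estimates by the recommending program and reports the number of teachers, the mean, and a simple standard error of the mean. The function name, the data layout, and the sample values are hypothetical, and the standard error shown ignores the estimation error in each teacher's score, which a state's actual model would need to account for.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def program_summary(teacher_scores):
    """Aggregate teacher-level value-added estimates by preparation program.

    teacher_scores: iterable of (program, value_added_estimate) pairs.
    Returns {program: {"n": count, "mean": average, "se": standard error or None}}.
    """
    by_program = defaultdict(list)
    for program, score in teacher_scores:
        by_program[program].append(score)

    summary = {}
    for program, scores in by_program.items():
        n = len(scores)
        # A standard error needs at least two teachers; otherwise report None.
        se = stdev(scores) / sqrt(n) if n > 1 else None
        summary[program] = {"n": n, "mean": mean(scores), "se": se}
    return summary


# Made-up estimates, in student-level standard deviation units.
sample_scores = [
    ("A", 0.10), ("A", 0.20), ("A", 0.00),
    ("B", -0.05), ("B", 0.05),
]
vam_summary = program_summary(sample_scores)
```

Reporting the number of teachers and the standard error next to each program mean makes it easier to see when an apparent difference between two programs rests on very few graduates.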

The strength of this approach is that test-based, value-added estimates provide a common metric 
to compare programs. Moreover, the differences among programs can be quantified using this 
approach (see, e.g., Boyd et al., 2009b; Gansle, Burns, & Noell, 2011; Goldhaber & Liddle, 2012).
Although value-added results provide little guidance in terms of how to actually improve programs, 
they can be used as a trigger for further action. In other words, if student growth is relatively high 
among the graduates of a particular teacher preparation program, that program can be honored
as well as studied further to determine what made it relatively effective. Comparing process
measures, such as syllabi reviews or measures of field experience quality, in teacher preparation
programs with high value-added measures may help uncover best practices in teacher preparation.
This information can also be used to serve equity purposes to see which programs are producing
graduates who are more effective with students of color.

6 Currently, most states do not have the ability to share data with other states.

This approach is not without challenges, however. Many concerns exist regarding how VAM or other 
measures of student growth can assess individual teachers. For example, researchers have noted 
the possibilities of attribution error, bias, and unintended negative consequences. (For descriptions 
of such concerns, see Baker et al. [2010]; Darling-Hammond, Amrein-Beardsley, Haertel, & Rothstein
[2012]; and Kennedy [2010].) When aggregated to the program level, some of the concerns about 
the validity of these measures are reduced but not totally eliminated (Mihaly, McCaffrey, Sass, & 
Lockwood, 2012). Moreover, statewide tests do not assess the contribution of the many graduates 
from teacher preparation programs in private universities and colleges, those who teach in nontested 
subject areas, and those who teach out of state (Crowe, 2011). The implementation of value-added 
measures requires extensive data system capacity (Kukla-Acevedo, Streams, & Toma, 2009), and 
few states have fully tested and functional data systems. 7 

The nonrandom assignment of the graduates of teacher preparation programs to schools creates 
additional challenges. For example, trade-offs exist between ensuring an adequate sample size to 
make valid conclusions (which often means following graduates for 3 years into their school 
placements) and ensuring that the effects of the teacher preparation program on student 
achievement are properly disentangled from the effects of teacher learning, with behaviors 
that are reinforced by graduates’ colleagues and induction programs at their school sites 
(Kukla-Acevedo et al., 2009). 

Moreover, statewide academic achievement tests are not necessarily comprehensive measures of 
desired student academic, civic, and social-emotional learning outcomes that teacher preparation 
programs intend to produce and thus remain imperfect measures of program effectiveness 
(Henry et al., 2011a). 

Finally, if these challenges were not enough, recent studies conducted by the National Center for 
Analysis of Longitudinal Data in Education Research found little variation in teacher training program 
effects as measured by VAMs, suggesting that teacher preparation programs are more similar than 
different in their effectiveness in terms of student test scores (Goldhaber & Liddle, 2012; Koedel, 
Parsons, Podgursky, & Ehlert, 2012; Mihaly et al., 2012). Furthermore, as Koedel et al. (2012) 
stated based on their study of preparation programs in Missouri, “Virtually all of the variation in 
teacher effectiveness in the labor force occurs across teachers within programs” rather than 
between programs (p. 7). How VAMs are specified can also influence the rankings of teacher 
preparation programs. For example, Mihaly et al. (2012) found that when school fixed effects were 
included in the models, rankings changed drastically, with at least one preparation program moving 
from the bottom quartile to the top quartile after the model specifications changed. Thus, caution 
is warranted in using VAM scores to evaluate programs. As Koedel et al. argue, given the relatively 
small differences found between programs and the different variations and trade-offs associated with 
different VAM models, state-level decision makers and K-12 administrators must avoid placing too 
much weight on program-level VAM scores when making critical decisions for program accountability 
or hiring purposes (Koedel et al., 2012). 
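
To make the within-program versus between-program distinction concrete, the following Python sketch splits the total variation in a set of teacher-level scores into the part explained by program means and the part left among teachers within programs. It is a plain sum-of-squares decomposition on made-up numbers, offered only as an illustration; it is not the method used in the studies cited above, which also adjust for estimation error in the teacher-level scores.

```python
from collections import defaultdict
from statistics import mean


def between_program_share(teacher_scores):
    """Return the fraction of total variation that lies between programs.

    teacher_scores: iterable of (program, score) pairs.
    Uses the identity: total sum of squares = between-program + within-program.
    """
    by_program = defaultdict(list)
    for program, score in teacher_scores:
        by_program[program].append(score)

    all_scores = [s for scores in by_program.values() for s in scores]
    grand_mean = mean(all_scores)

    # Between-program part: each program mean's squared distance from the
    # grand mean, weighted by the number of teachers in that program.
    between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in by_program.values()
    )
    total = sum((s - grand_mean) ** 2 for s in all_scores)
    return between / total if total else 0.0


# Hypothetical scores: program means differ only slightly (0.0 versus 0.1),
# while teachers within each program are spread widely.
scores = [
    ("A", -0.2), ("A", 0.0), ("A", 0.2),
    ("B", -0.1), ("B", 0.1), ("B", 0.3),
]
share = between_program_share(scores)
```

In this invented example less than one tenth of the variation lies between the two programs. That is the kind of pattern that makes program rankings sensitive to small changes in model specification.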


7 Currently, 46 states do not share teacher performance data with their teacher preparation programs (Data Quality
Campaign, 2012).

State and District Teacher Evaluation Results

To receive flexibility under the Elementary and Secondary Education Act, states must establish 
guidelines for teacher employment evaluation that “meaningfully [differentiate] teacher performance 
using at least three performance levels” (Duncan, 2011; see also U.S. Department of Education, 
2012a). Even without this requirement, states across the country have been developing teacher 
evaluation systems that result in a summative evaluation score based on some combination of 
measures of teaching practices, professional responsibilities, and student achievement growth 
(Goe, Holdheide, & Miller, 2011). Some Race to the Top states are planning to link these evaluation 
results back to the preparation programs that prepared the teachers so that they can publicly report 
comparisons among teacher preparation programs. 

Teacher evaluation results as measures of the quality of teacher preparation programs have multiple 
strengths. Compared to using only measures of student growth to assess the effectiveness of the 
graduates of teacher preparation programs, teacher evaluation results can paint a much more 
comprehensive picture of program effectiveness because they are based on multiple measures. 
In addition, evaluation results are (or will be) available for all public school teachers. Evaluation data 
can help teacher preparation programs determine whether they are producing graduates who can 
perform in ways that meet state standards and district expectations. In addition, depending on the 
depth of the analysis and the data available, teacher evaluation results can pinpoint the strengths 
and the weaknesses of their respective preparation programs. For example, a thorough review of 
teacher evaluation results may reveal that the graduates of teacher preparation programs tend 
to be strong in content knowledge but lack skills related to class management. 

Teacher evaluation results also have drawbacks. Most teacher evaluation processes and metrics 
are locally controlled (and in many cases locally bargained). Thus, evaluation results may not be 
comparable across school districts and states. This may skew comparisons among teacher 
preparation programs if the graduates from particular preparation programs are more often hired 
into school districts whose evaluation systems are less rigorous than others. It may even create 
incentives for preparation programs to help place teachers in school districts that they know do 
not have as rigorous an approach to evaluation as other school districts. Finally, these new teacher 
evaluation systems are still in their infancy, and their validity and reliability have not yet been proven 
(e.g., Bill & Melinda Gates Foundation, 2012). Making high-stakes decisions in terms of program 
accountability (much less individual accountability) based on these systems should be done with 
extreme caution. 

Surveys of Principals and Employers 

Another way to gauge the performance of the graduates of teacher preparation programs is to ask 
their supervisors (usually their principals) about the quality of graduate performance. Research has 
shown a high correlation between principal assessment and teachers’ value-added scores (Sartain 
et al., 2011; Tyler, Taylor, Kane, & Wooton, 2010), so this may be a less onerous way to gain program
feedback and information on program effectiveness (although there is little, if any, research showing 
the correlation between principals’ preparation program survey responses and teacher effectiveness). 
Surveys may also help principals pay closer attention to how and where their new hires are prepared. 

In addition, surveys engage stakeholders and offer local education agencies opportunities to provide 
input regarding the preparation of teacher candidates. Texas used principal surveys in the 2011-12 
school year (see Sample Questions From the Texas Teacher Preparation Effectiveness Survey: First 
Year Teachers). 
Sample Questions From Principal Surveys to Evaluate Texas Educator Preparation Programs:
First-Year Principal Survey


All questions asked principals to select one of the following choices: well prepared, sufficiently
prepared, not sufficiently prepared, or not at all prepared.

• To what extent was this beginning teacher prepared to effectively implement discipline/management 
procedures? 

• To what extent was this beginning teacher prepared to integrate effective modeling, questioning, 
and self-reflection (self-assessment) strategies into instruction? 

• To what extent was this beginning teacher prepared to make appropriate decisions (e.g., when and 
how to make accommodations and/or modifications to instruction, assessment, materials, delivery, 
and classroom procedures) to meet the learning needs of students who have an individualized 
education program (IEP)? 

• To what extent was this beginning teacher prepared to provide appropriate ways for limited-English 
proficient students and English language learners to demonstrate their learning? 

• To what extent was this beginning teacher prepared to provide technology-based classroom learning 
opportunities that allow students to interact with real-time and/or online content? 

Adapted from the First Year Principal Survey of the Texas Education Agency at
http://www.tea.state.tx.us/index2.aspx?id=2147484163&menu_id=2147483671&menu_id2=794.


Surveys of the Graduates of Teacher Preparation Programs 

When designed and administered carefully, surveys of the graduates of teacher preparation programs 
can provide useful information to both states and the teacher preparation programs. The survey 
information can be used for program accountability, improvement, and educational equity. Many 
IHEs survey their recent graduates so that they can obtain feedback on their teacher preparation 
programs. These measures are inexpensive to distribute, but ensuring sufficient response rates 
can be a significant challenge, and timing the survey distribution properly takes careful thought 
and organization. As with other surveys, the data gathered from surveys of the graduates of teacher 
preparation programs reflect feelings of preparedness, self-efficacy, and program perceptions — not 
actual preparedness and actual program quality (Darling-Hammond, 2006). In addition, these 
surveys of graduates are rarely common instruments used by all teacher preparation programs 
(or even more than one) in a state and thus limit comparability across programs. 

One exception is an online survey used by the Center for Teacher Quality at California State 
University. The center employs an exit evaluation administered as candidates are graduating and 
also surveys graduates toward the end of their first and third years of teaching (see Sample 
Questions From the California State University System's Exit Survey).

The New York City Pathways to Teaching Project also uses surveys of the graduates of teacher 
preparation programs but only for research purposes. Pathways researchers have teased out 
differences in the effectiveness among programs and program features. Recent studies found 
empirical relationships between survey findings and teacher effectiveness as measured by student 
achievement outcomes (Boyd et al., 2009b). Sample questions from the survey (see Sample Items
From the New York City Pathways to Teaching Project First-Year Teacher Survey) may be useful to 
states as they develop surveys for program improvement, accountability, or equity. 
Sample Questions From the California State University Exit Survey for Multiple Subject Respondents


As a new teacher, I am (a) well prepared to begin, (b) adequately prepared to begin, (c) somewhat 
prepared to begin, or (d) not at all prepared to begin 

• To meet the instructional needs of English language learners. 

• To assess pupil progress by analyzing a variety of evidence, including exam scores. 

• To communicate effectively with the parents or the guardians of my students. 

• To know and understand the subjects of the curriculum at my grade level(s). 

• To create an environment that supports language use, analysis, practice, and fun. 

• To assist students in decision making, problem solving, and critical thinking. 

Based on your experience as a K-12 preservice teacher, how valuable or helpful were these 
elements of your Teaching Credential Program? (very, somewhat, a little, or not at all) 

• Instruction in methods of classroom teaching and management 

• Instruction in the teaching of mathematics in Grades K-8 

• Instruction in how children and adolescents grow and develop 

• My supervised teaching experiences in K-12 schools 

• Guidance and assistance from field supervisor(s) from campus 

While you were in the Teaching Credential Program, how true was each of the following 
statements? (true, mostly true, somewhat true, or not true) 

• The program provided an appropriate mixture of theoretical ideas and practical strategies, 
and I learned about links between them. 

• During the program, I saw evidence that university faculty worked closely with teachers in 
K-12 schools. 

• I taught in at least one school that was a good environment for practice teaching and reflecting 
on how I was teaching pupils. 

Adapted from the Teacher Education Exit Survey for Multiple Subject Respondents from the
Center for Teacher Quality (2005-06) at
http://www.calstate.edu/teacherquality/documents/teacherprep_exit_survey_multiple.pdf.
Sample Items From the New York City Pathways to Teaching Project First-Year Teacher Survey


In your preparation to become a teacher, prior to becoming a full-time classroom teacher, how 
much opportunity did you have to do the following (extensive opportunity, explored in some depth, 
spent time discussing or doing, touched on it briefly, none)?

• Study stages of child development and learning 

• Develop strategies for handling student misbehavior 

• Consider the relationship between education and social justice and/or democracy 

• Learn how to fill out IEPs 

• Learn ways to teach decoding skills 

• Learn how to activate students’ prior knowledge 

• Practice what you learned about teaching reading in your field experiences 

• Learn typical difficulties students have with fractions 

• Study, critique, or adapt mathematics curriculum materials 

• Study national or New York State standards for childhood mathematics 

• Learn strategies for addressing the needs of students with mild to moderate disabilities in the 
classroom 

• Learn how to encourage scientific inquiry 

Thinking about the supervision and feedback that you received during your experiences in schools 
as part of your preparation to become a teacher and prior to becoming a full-time classroom teacher, 
please rate the extent to which you agree with the following statements (response options range from 
strongly agree to strongly disagree): 

• The teacher(s) I observed were excellent teachers and worthy role models.

• When I participated in the classroom, I got useful feedback. 

• My experiences allowed me to try out strategies and techniques I was learning in my preservice 
classes. 

Adapted from Surveys from Teacher Policy Research at
http://www.teacherpolicyresearch.org/TeacherPathwaysProject/Surveys/tabid/115/Default.aspx.
Copyright 2005 by Teacher Policy Research.



Hiring and Placement Data 

Although larger economic, social, and governmental forces are at work when it comes to the 
number of teacher vacancies in particular schools and school districts (Liu & Johnson, 2006) and 
the extent to which teacher preparation program graduates pursue those vacancies (Johnson, Berg,
& Donaldson, 2005; Reininger, 2012), administrative data can also speak to the quality of teacher
preparation programs. States can use hiring and placement data to determine the extent to which 
the graduates of teacher preparation programs are hired as full-time teachers after graduation. 
Nevertheless, because of the limited control that teacher preparation programs have over hiring 
and placement, this measure should not, of course, be the sole criterion of quality. 

Hiring and placement data linked to programs can further help stakeholders understand whether 
teacher preparation programs are helping to prepare candidates for schools with large proportions 
of low-income and minority students, thereby helping a state meet its equitable distribution goals. 
This information can also be useful for teacher preparation programs to learn what the labor market 
need is for particular kinds of teachers. For example, Greenberg et al. (2011) recently estimated 
that teacher preparation programs overproduce general elementary school teachers each year 
while underproducing teachers in specific subject areas. 8 

Data on Persistence in Teaching 

Inadequate teacher preparation has been cited as one of the reasons for disproportionately high 
rates of attrition among beginning teachers, in addition to poor teaching conditions, low salaries, 
a lack of discretion and autonomy, and lackluster school leadership (Darling-Hammond, 2003; 
Ingersoll & May, 2011; Loeb, Darling-Hammond, & Luczak, 2005). Although there are many factors 
at work that explain teachers’ decisions to leave their initial placements or the teaching profession, 9 
if a teacher preparation program has a disproportionately high percentage of graduates who leave 
the profession after their first 2 or 3 years of teaching, then it may be a sign that something is 
amiss in either the selection process or the preparation experience that is not helping ensure 
that teachers persist in teaching. Furthermore, if many of the graduates of a teacher preparation 
program leave the teaching field because their contracts were not renewed as a result of poor 
teaching performance, then this information should also trigger a closer look at the quality of a 
teacher preparation program. However, because many factors influence teacher retention,
persistence in teaching should not be the sole criterion of program quality. 

Teacher Candidate Knowledge and Skills Outcomes 

Another way to examine the quality of teaching preparation that a program provides is by looking 
at candidates’ knowledge and skills just before they graduate and are granted a state license to 
teach. For example, Title II requires that the states report on the pass rates of preparation program 
candidates on state licensure tests and encourages states to hold those programs that have the 
lowest scores accountable, either through closure or program review. To gauge candidate knowledge 
and skills, states or programs can employ either paper-and-pencil tests or more comprehensive 
performance assessments. 


8 See http://www2.ed.gov/about/offices/list/ope/pol/tsa.pdf for a nationwide listing of teacher shortage areas. 

9 See Boyd et al. (2009a); Ingersoll & Smith (2003); and Lankford, Loeb, & Wycoff (2002) for discussions of reasons for 
teacher attrition. 
Licensure Exams. Pass rates on licensure exams have long been used in Title II reporting to
indicate program effectiveness. All but 3 states require teachers to take licensure exams to test 
their basic knowledge and skills, pedagogy, and/or content knowledge before they are eligible for a 
license (Crowe, 2010). Some states have as many as 85 different tests — with a test for each grade 
level band and content area — and cut scores are typically determined by panels of experts who 
make estimates of the number of questions a minimally qualified candidate should answer correctly 
(Goldhaber, 2010). In part because of the subjective element in determining cut scores, cut scores 
on the same test can vary dramatically among the states (Goldhaber, 2010). However, most states 
set cut scores at or below the national median score established for a certification test; in the 
2008-09 school year, the pass rates on teacher assessments were 95 percent for traditional route 
program completers and 97 percent for alternative route program completers (U.S. Department of 
Education, 2011c). 

Moreover, research has cast doubt on whether the benefits of licensure testing are worth the cost.
As Wilson and Youngs (2005) noted, licensure exams were not designed to predict teaching success;
they were created to set minimum standards for teacher knowledge. The predictive validity of teacher
certification tests is still unknown (Wayne & Youngs, 2003; Wilson & Youngs, 2005). Recent
studies have found only modest positive relationships between teacher licensure exam scores and 
student achievement (Clotfelter, Ladd, & Vigdor, 2007; Goldhaber, 2007). In addition, Goldhaber 
(2007) found numerous instances where teachers who failed to pass the exam but were emergency 
certified turned out to be highly effective in the classroom. 10 There is also emerging evidence that 
licensure exams are not good at predicting whether such teachers will be effective with all students, 
particularly with male students of color (Goldhaber & Hanson, 2010). In other words, licensure 
exams may have a disparate impact on not only teachers of color but also their students. 

Other Written Tests of Candidate Knowledge and Skills. Educational researchers have used other, 
potentially more predictive tests of teacher knowledge and skills to measure the impact of teacher 
preparation programs on teacher candidate outcomes. States, or coalitions of states, may consider 
adopting these tests for teacher preparation program accountability and improvement. 

For example, researchers at the University of Michigan and Harvard University developed the 
Mathematical Knowledge for Teaching (MKT), a bank of multiple-choice items to measure both 
subject matter knowledge and pedagogical content knowledge. Recent studies using this instrument 
have found that teachers’ MKT scores were strongly related to the mathematical quality of their 
instruction (Hill & Lowenberg Ball, 2009). 

Performance-Based Assessments. Unlike written tests, performance-based assessments, such as 
portfolios, artifacts, and teaching exhibitions, capture how teachers and teacher candidates apply 
what they have learned to their teaching practices. Performance-based assessments are often 
considered more authentic and more contextualized assessments of teacher practices than written 
tests (Darling-Hammond & Snyder, 2000). In multiple studies, teachers and teacher candidates 
reported that the process of completing performance-based assessments actually improved their 
teaching practices (Darling-Hammond, 2010; Darling-Hammond & Snyder, 2000; Pecheone, Pigg, 
Chung, & Souviney, 2005). That being said, there is little research that examines the correlation 
between teacher performance-based assessment scores and student achievement. 


10 An emergency certified teacher is a teacher who received a temporary teaching certificate but did not meet the state’s full
certification criteria.

Despite the strengths of performance-based assessments, using them
on a large scale requires considerable resources. First, support is critical to enable teacher candidates 
and teachers to complete and fully benefit from the process. According to Pecheone et al. (2005), 
performance-based assessments “presume the existence of a supportive and collegial environment 
to promote reflection, are sensitive to context, and to the support new teachers receive” (p. 168). 
Second, considerable time is needed to both prepare entries for performance-based assessments 
and then score them (Darling-Hammond, 2010). Third, implementing and scoring
performance-based assessments is costly. For example, the cost of the new Teacher Performance
Assessment (TPA) is estimated to be $300 (Butrymowicz, 2012; Evergreen State College, 2012).
Although the thoughtful use of technology may reduce overall costs over time, additional research
is needed to make informed decisions about using performance-based assessments to evaluate 
teacher preparation programs (Pecheone et al., 2005). 

TPA is subject specific and is available in at least 13 subjects (Pearson, 2011). The developers say 
that it is aligned with state standards, the Interstate Teacher Assessment and Support Consortium 
standards, the Common Core State Standards, and the Specialized Professional Association standards 
(American Association of Colleges for Teacher Education [AACTE], 2012b). According to AACTE
(2012a), TPA 

■ “Measures candidates’ readiness for teaching and will be predictive of a candidate’s success 
in affecting student achievement 

■ Creates a body of evidence of teaching performance 

■ Contributes evidence for licensure decisions (in combination with other measures) 

■ Provides a consistent measure of candidate performance across teacher preparation programs 

■ Supports candidate learning and development of high-leverage teaching practices 

■ Measures candidates’ ability to differentiate instruction for diverse learners, including English 
language learners and students with disabilities 

■ Improves the information base for accreditation of teacher preparation programs” 

TPA uses evidence from three to five lessons that teachers deliver to students from one class as the 
basis for a summative score. Evidence includes video clips of instruction, lesson plans, student
work samples, an analysis of student learning, and reflective commentaries (Pearson, 2011). 
Teachers submit these items via an electronic platform. Trained scorers who have recently worked 
as college, university, or PK-12 educators then evaluate these portfolios of evidence (Pearson, 
2012). Field testing occurred in spring 2012, and developers plan to have the final assessment 
available for use in fall 2012 (Pearson, 2011). 11 Over 25 states and almost 200 teacher preparation 
programs comprise the Teacher Performance Assessment Consortium (AACTE, 2012b), but some 
teacher candidates and educators remain skeptical that the assessment will accurately capture 
teacher practices (Winerip, 2012). 

Teacher educators at the University of Michigan’s School of Education (UMSOE) are also developing 
a potentially more practice-based performance assessment modeled on the clinical preparation 
for doctors. Still under development, the TeachingWorks set of performance assessments will 
“focus specifically and in detail on teachers’ performance of high-leverage practices and on their 
understanding of the content they teach,” according to the developers (TeachingWorks, 2012). It 
will likely build on the performance assessments that UMSOE is currently beginning to use, which
include tasks that are based on medical training assessments. For example, in the “standardized
patient” assessment task, a volunteer “plays” a patient and presents symptoms to medical students
who are then assessed on their diagnostic ability. Likewise, teacher educators at UMSOE have
someone “play” a K-12 student and present the same kind of teaching situation to different
teacher candidates. The interactions between the teacher candidate and the “student” are
videotaped and later evaluated by UMSOE faculty for the candidates’ ability to understand and
respond appropriately to the situation (J. DeMonte, personal communication, February 24, 2012).


11 Some IHEs are charging candidates fees for participating in the pilot, and other states have sought and received
foundation-sponsored grants to reduce costs. Pearson will start charging candidate fees (as much as $300) in fall 2012.

The TeachingWorks group is also seeking to develop a common curriculum for professional teacher 
training that focuses on what the evidence says are high-leverage practices and the subject matter 
knowledge that teachers need to teach effectively. The data that will be collected from these and 
other efforts to evaluate the outcomes of teacher preparation programs promise to build the 
evidence base for what works in preservice preparation — something that is sorely needed. 

Outcomes Measures Summary 

In summary, the options for evaluating teacher preparation programs based on program outcomes 
have great potential for providing tremendously important evidence on which to base improvement, 
accountability, and equity decisions. Nevertheless, each measure has drawbacks that need to be 
considered. Table 2 is a compilation of the many opportunities and challenges that exist for this work. 

Table 2. The Opportunities and Challenges of Evaluating Teacher Preparation Based on Outcomes


Opportunities


For accountability 

• Compared to input measures, 
outcomes measures provide stronger 
and more meaningful evidence on 
which to base decisions to sanction or 
close ineffective teacher preparation 
programs and reward effective ones. 

For capacity building 

• For preparation programs: Can provide
teacher preparation programs with
evidence-based feedback on how
well they are preparing teachers
to be successful in their teaching
assignments.

• For schools: Can assist in hiring, 
providing school leaders with 
information on the teacher 
preparation programs where they 
would be more likely to find teacher 
candidates who would be successful. 

• May incentivize teacher preparation 
programs to develop and maintain 
collaborative partnerships with 
school districts. 

For equity 

• Given the right levers, can incentivize 
teacher preparation programs to 
better prepare teacher candidates for 
high-need schools and traditionally 
underserved populations. 

• School districts can see which teacher 
preparation programs better prepare 
teachers for high-need schools and 
hire those graduates. 


Challenges

Overall challenges

• Teacher preparation programs change over time, so the feedback 
that programs get from evaluations based on graduate outcomes 
2 or 3 years later may no longer be indicative of current 
program quality. 

• Size matters. Small teacher preparation programs may have large 
fluctuations in aggregated graduate outcomes from year to year as 
an artifact of size, not necessarily of quality. 

• Although there is currently no evidence that this is occurring, 
measuring outcomes may provide incentives for teacher 
preparation programs to encourage graduates to apply to schools 
where they are more likely to be successful (i.e., higher 
socioeconomic status school districts or high-functioning schools 
with strong professional learning communities). 

• Teachers are often prepared through more than one preparation 
program, so attributing their outcomes to a particular program can 
be problematic. 

• State teacher data systems may lack capacity to link teachers to 
specific preparation programs, much less contain data on outcomes. 

• Graduate mobility. Gathering outcomes data on graduates who 
leave the state can require additional resources. 

• Collecting comparable data from graduates of preparation programs 
who teach in private schools may be difficult to impossible. 

In terms of using graduates’ comprehensive teacher 
evaluation results 

• Few K-12 school districts have mature, valid, and reliable teacher 
evaluation systems that could provide valid information on the 
effectiveness of the graduates of teacher preparation programs. 

• Local variation in evaluation system design and implementation 
means that there will be variability in the rigor of teacher 
evaluation, which may be difficult to account for when comparing 
teacher preparation programs. 

• Few states currently have adequate data systems to link teacher 
evaluation results back to teacher preparation programs. 

In terms of using value-added data 

• Few states currently have adequate data systems to link teacher 
effectiveness data back to teacher preparation programs. 

• VAMs continue to be criticized for instability, bias, attribution error, 
and so forth; however, aggregated to the program level, these 
problems may be less concerning. 

• VAMs act as signals of performance but do not provide in-depth 
data that can be used to inform improvements in teacher 
preparation programs. 

• Recent studies have found more variation among teachers’ VAM 
scores within programs than between programs, thus limiting their 
use as a way to differentiate program effectiveness. 

How Are States on the Forefront of Change Approaching
the Evaluation of Their Teacher Preparation Programs? 12

In recent years, six states have led the way in changing how they evaluate the effectiveness of 
teacher preparation programs. In this section, we profile the efforts of these states in varying 
stages of implementation to provide real-world examples of how some states are combining 
measures to construct a more complete picture of the quality of teacher preparation programs. 
Many of these states continue to wrestle with the trade-offs associated with each measure and 
are working through challenges of implementing new evaluation systems. That being said, their 
lessons learned and ongoing efforts can help inform current and future efforts. Only two states 
(Louisiana and Texas) intend to use these measures for accountability purposes as of February 
2012 (Sawchuk, 2012a), but all states intend to provide greater information to potential teacher 
candidates, teacher preparation programs, and the public. 13 Table 3 summarizes the content of 
the six profiles that follow. 


Table 3. Summary of Sample States Using Processes and Outcomes to Evaluate 
Teacher Preparation 


Louisiana (a, b)

Measures Selected/Used
• Review of reform proposals
• NCATE accreditation
• Praxis scores
• Surveys of the graduates of teacher preparation programs
• The change in the number of students completing teacher preparation programs
• Value-added scores of graduates
• Authentic university-school partnerships (selected but not implemented)

Recent Accomplishments
• The first state to use VAM in its assessment of teacher preparation programs
• Adopted one VAM that will evaluate both teachers and teacher preparation programs

Challenges
• Disaggregating VAM data to provide useful, actionable data
• Accumulating sufficient data to permit public dissemination of the data


12 The authors would like to thank Julie Orange and Eileen Fisher (Florida State Department of Education), Martha Hendricks-Lee 
(Ohio Association of Colleges for Teacher Education), Tom Bordenkircher (Ohio Board of Regents), Jeanne Burns (Louisiana 
Board of Regents), Janice Lopez (Texas Education Agency), Emily Carter (Tennessee Department of Education), Elissa 
Brown (North Carolina Department of Public Instruction), and Grant W. Simpson (Texas Association of Colleges for Teacher
Education) for their assistance in preparing these examples.


13 A recent Education Week article featured a chart profiling states’ plans to report value-added information on their teacher 
education programs. The District of Columbia and 13 states (Delaware, Florida, Georgia, Hawaii, Louisiana, Maryland,
Massachusetts, North Carolina, New York, Ohio, Rhode Island, Tennessee, and Texas) currently report or plan to soon report
value-added data. Of those, 6 states plus D.C. plan to use these measures for accountability purposes: District of 
Columbia, Louisiana, Maryland, North Carolina, New York, Rhode Island, and Texas (Sawchuk, 2012a). 

Texas (b)

Measures Selected/Used
• Certification examination pass rates
• Appraisals of graduate performance
• Value-added scores of the graduates of teacher preparation programs
• Survey and extant data on the frequency, the duration, and the quality of field supervision

Recent Accomplishments
• Distributed principal survey to all principals in May 2011
• Made requirements for certification programs much more uniform across teacher preparation program types

Challenges
• Training principals on importance of survey
• Gathering student achievement data in nontested grades and subjects
• Linking teachers to students

Tennessee (a)

Measures Selected/Used
• Placement and retention rates
• Praxis II results
• Value-added scores of the graduates of teacher preparation programs

Recent Accomplishments
• Published reports in 2011

Challenges
• Assessing the effectiveness of the graduates of teacher preparation programs who are working in nontested grades and subjects
• Gathering data on program completers working in private or out-of-state schools

North Carolina (a)

Measures Selected/Used
• Value-added scores of the graduates of teacher preparation programs

Recent Accomplishments
• Released reports in 2010 and 2011

Challenges
• Gathering data on program completers working in private or out-of-state schools
• Assessing the effectiveness of out-of-state teacher preparation programs

Ohio (a)

Measures Selected/Used
• Results of TPA
• Praxis II scores
• Additional examination results
• Appraisals of the graduates of teacher preparation programs
• Surveys of teacher candidates
• Feedback from educator residencies
• Value-added scores

Recent Accomplishments
• Created a portal on the Ohio Board of Regents website to house candidate and employer surveys

Challenges
• Finalizing what the higher education report card will look like

Florida (a)

Measures Selected/Used
• In process

Recent Accomplishments
• Currently preparing performance targets for the new teacher preparation program accountability system
• Organizing stakeholder meetings

Challenges
• Locating and leveraging expertise in assessing the quality of teacher preparation programs

(a) Denotes the state is a Race to the Top winner. (b) Denotes the state uses or intends to use data for accountability purposes.




Louisiana 

Louisiana has been lauded as having the most advanced and comprehensive student and educator 
data system to date (Russell & Wineburg, 2007). Since 2001, Louisiana’s IHEs have engaged in 
multiple efforts to demonstrate effectiveness at four different levels identified by Louisiana’s Blue 
Ribbon Commission on Teacher Quality (Louisiana Board of Regents, n.d.): 14 

■ “Level 1: effectiveness of planning (redesign of teacher preparation programs) 

■ Level 2: effectiveness of implementation (NCATE & PASS-PORT) 

■ Level 3: effectiveness of impact (teacher preparation accountability system) 

■ Level 4: effectiveness of growth in student learning (value-added teacher preparation 
program assessment)”

Level 1. To demonstrate the first level of effectiveness, IHEs were required to redesign their programs 
to address Louisiana’s new certification standards. These redesigns coincided with the redesign 
of the standards set by the Office of Elementary and Secondary Education and progressed from 
2002 to 2005. State and national experts reviewed the proposals for new or redesigned teacher 
preparation programs, evaluated them, and then made recommendations for approval (Noell, Burns,
& Gansle, 2009).

Level 2. All teacher preparation programs were required to become accredited by NCATE and 
assess the knowledge, the skills, and the dispositions of teacher candidates using PASS-PORT, 
a Web-based performance-assessment system (Russell & Wineburg, 2007). 

Level 3. The Blue Ribbon Commission on Teacher Quality also created an accountability system 
for teacher preparation programs, which it implemented up until Hurricane Katrina in 2005 
(J. Burns, personal communication, February 17, 2012). As part of this system, the Louisiana 
Board of Regents (2003-04) generated a report for each IHE. The 2003-04 reports included the 
following information: 

■ The number of students in the teacher preparation program 

■ The number of students who participated in supervised student teaching or internship 
experiences 

■ The number of faculty who supervised student teaching and internship experiences 

■ The student-to-faculty ratio for student teaching and internship experiences

■ The average number of hours per week, the total number of weeks, and the total number of 
hours that the school requires for student teaching 

■ Praxis scores (part of the institutional performance measure) 

■ The results from a graduate satisfaction survey, using 2 years of data (part of the institutional 
performance measure) 

■ The change in the number of students who completed teacher preparation programs (the 
quantity measure)

The Board of Regents combined some data points into measures of institutional performance and 
quantity. Using these measures, they created an institutional score that converted into one of six 
preparation performance labels (Louisiana Board of Regents, 2003-04). 


14 In April 1999, the Louisiana governor, the Board of Regents, and the Board of Elementary and Secondary Education created 
the Blue Ribbon Commission on Teacher Quality. This group consists of “thirty-six state, university, district, school, and 
communication leaders” (Louisiana Board of Regents, n.d.). 

Hurricane Katrina impacted the implementation of additional elements of the accountability system,
and the state is slowly rebuilding its system. Although Louisiana has not yet implemented a new 
accountability system, such as the one prior to Hurricane Katrina, it has investigated the utility 
of data through additional studies. For example, a study conducted with funds from the Carnegie 
Corporation of New York used VAM and found that the graduates of teacher preparation programs 
are more effective at teaching some grades than others (J. Burns, personal communication, 
February 17, 2012). This finding suggests the new Louisiana model may need to incorporate 
value-added data divided by grades, grade bands, or subjects to really understand the effectiveness 
of their teacher preparation programs. In addition, when looking at how teachers responded to 
the survey, investigations revealed that teachers with the highest value-added scores had the 
lowest ratings for their teacher preparation programs. Jeanne Burns (personal communication, 
February 17, 2012) explained that this disconnect raised validity concerns that the Board of 
Regents will need to consider when revising its accountability system. 

Level 4. As Noell and Burns (2006) noted, the reviewers of teacher preparation programs found that 
IHEs lacked the capacity to develop rigorous assessments and conduct individual analyses of the 
effects of their graduates given the “geographic dispersion of graduates, the variety of content and 
grade levels that graduates teach, the heterogeneity of the students the graduates teach, and the 
limitations of finite resources” (p. 39). This finding led Louisiana’s Blue Ribbon Commission for Educational
Excellence to recommend the creation of a statewide system for assessing the
impact of new graduates, namely by using VAM (Noell & Burns, 2006). 

Louisiana piloted a VAM created by Dr. George Noell from 2003 to 2006 and then fully implemented 
the model in the 2006-07 school year (Gansle, Burns, & Noell, 2011). Using this model, Louisiana 
reported the effectiveness of teacher preparation programs along five effectiveness levels, comparing 
the performance of the graduates of teacher preparation programs to other new teachers and to 
experienced teachers. 

Since the introduction of VAM in Louisiana, some teacher preparation programs in Louisiana have 
used the data to help inform changes to their programs. For example, after viewing data showing 
that students taught by the graduates from the University of Louisiana at Lafayette struggled with 
essay questions, the school changed its requirements for teacher candidates to include more 
writing instruction in introductory English classes. In response to other data, the school set up 
faculty teams to examine the teacher preparation curriculum, changed the sequencing of the 
elementary mathematics courses, and increased the amount of time that faculty members must 
observe student teachers (Sawchuk, 2012a). 

Beginning in the 2012-13 school year, all teacher evaluations in Louisiana must include a 
value-added measure. This measure, the Louisiana Department of Education value-added teacher 
evaluation model, will also be used to evaluate teacher preparation programs going forward. Using 
the value-added scores of first- and second-year teachers, the Board of Regents will calculate the 
mean value-added scores for IHEs and alternative teacher preparation programs that prepare 
new teachers. Gansle, Burns, and Noell (2011) argued that using the same metric used in teacher 
evaluations will increase cost efficiency, align supports given to K-12 systems and teacher 
preparation systems, enable Louisiana to communicate more clearly to the public, provide more 
information about student test histories and discipline histories, and permit subgroup analyses. 

Although the state released mean scores for the 2010-11 school year where sufficient numbers of
graduates existed, Louisiana is still in the process of creating benchmark levels based on the new 
VAM. As of yet, few teacher preparation programs have enough data to warrant public reporting 
(Gansle, Noell, & Burns, 2012); thus, the implementation of VAM will take time. 
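
The program-level calculation described above can be illustrated with a minimal sketch. It assumes each record pairs a preparation program with one first- or second-year teacher's value-added score, and it suppresses the mean for programs with too few teachers. The function name, the record layout, and the threshold of 25 teachers are hypothetical choices for illustration; they are not drawn from the Louisiana model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical reporting threshold; the source does not state Louisiana's actual minimum.
MIN_TEACHERS = 25


def program_means(records, min_teachers=MIN_TEACHERS):
    """Average teacher value-added scores by preparation program.

    records: iterable of (program_name, teacher_score) pairs.
    Returns a dict mapping each program to its mean score, or to None
    when the program has fewer than min_teachers scores to report.
    """
    scores_by_program = defaultdict(list)
    for program, score in records:
        scores_by_program[program].append(score)
    return {
        program: (mean(scores) if len(scores) >= min_teachers else None)
        for program, scores in scores_by_program.items()
    }
```

A suppression rule of this kind reflects the concern noted in Table 2: small programs can show large year-to-year swings that reflect size rather than quality.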

Texas 

In 2009, the Texas legislature passed Senate Bill 174, which requires the evaluation of teacher 
preparation programs. Since then, the state has revised the Texas Education Code (TEC) in 
accordance with Senate Bill 174. 15 The new code requires that the Texas State Board for Educator 
Certification (SBEC) annually review the accreditation status of each teacher preparation program 
and assign one of five statuses (not rated, accredited, accredited-warned, accredited-probation, and 
not accredited-revoked) to each IHE based on the results of the review (TEC §21.0451). In addition, 
information and data about each IHE must be posted on the state’s website (TEC §21.0452). When 
conducting reviews, SBEC measures the effectiveness of teacher preparation programs according 
to the following four measures (J. Lopez, personal communication, May 21, 2012): 

1. The pass rate performance standard of certification examinations of teacher preparation 
program candidates 

2. The results of appraisals of beginning teachers by school administrators 

3. The improvement in student achievement of students taught by beginning teachers for the 
first 3 years following certification 

4. The frequency, the duration, and the quality of field supervision of beginning teachers 

Certification Examinations. Teacher candidates in Texas are required to pass the Texas 
Examinations of Educator Standards classroom certification tests. Texas uses these results to 
assess the pass rate performance for teacher preparation programs. SBEC set progressive pass 
rate performance standards ranging from a 70 percent pass rate standard in the 2009-10 school 
year to 80 percent in the 2011-12 school year (Texas Education Agency, n.d.a). 
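
As a rough illustration of how a progressive pass rate standard could be checked, the sketch below compares a program's pass rate with the standard for a given school year. Only the two standards stated above (70 percent for 2009-10 and 80 percent for 2011-12) are included; the function name and inputs are hypothetical, and the sketch does not represent SBEC's actual calculation.

```python
# Standards stated in the text; the 2010-11 value is not given there, so it is omitted.
PASS_RATE_STANDARDS = {"2009-10": 0.70, "2011-12": 0.80}


def meets_pass_rate_standard(passed, tested, school_year):
    """Return True if passed/tested meets the standard for school_year.

    passed and tested are hypothetical counts of a program's candidates.
    """
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    standard = PASS_RATE_STANDARDS[school_year]
    return passed / tested >= standard
```

Under these assumptions, a program with 75 of 100 candidates passing would meet the 2009-10 standard but fall short of the 2011-12 standard.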

Appraisals. SBEC piloted an electronic survey with all principals in 2010 (see page 11 for examples
of items; Texas Education Agency, n.d.b). The extensive survey asks principals to assess the 
performance of new teachers and their preparedness to be classroom teachers. The state 
included this measure in the system in part to gather information about new teachers in 
nontested grades and subjects for which no student achievement data were available. After 
reviewing the pilot data and receiving stakeholder input, SBEC revised the survey and distributed 
it in May 2011. The responses from 42,000 principals will be used to determine the impact of 
teacher preparation programs and will be reported for the 2011-12 school year. 

Despite revisions and gradual implementation, the survey remains imperfect. For example, recent 
results demonstrate a lack of variability in the responses. Nearly all principals assigned teachers 
ratings of highly skilled or adequately skilled. Given the skew of the data, few decisions or 
conclusions can be made about the quality of teacher preparation programs. In addition, IHEs,
which receive only aggregated data, cannot use the results of the appraisal surveys to make
informed decisions on improving their programs. For example, the deans of teacher preparation
programs would not be able to tell whether graduates from their elementary or secondary
programs received negative ratings.


15 For the relevant state code, see TEC §21.041 and §21.045 and the added §21.0451 and §21.0452.

According to Grant Simpson of the Texas Association of Colleges for Teacher Education, principals 
need additional training on the survey. In large high schools, principals or their designees may need 
to complete upward of 20 surveys of recent graduates of teacher preparation programs. Completing 
this task is time consuming, but for the results to be relevant, the survey must also be accurate. 
Thus, principals may need a greater understanding of the implications of the survey (G. W. Simpson, 
personal communication, May 23, 2012). 

Student Achievement. The Texas Education Agency contracted with the Lyndon B. Johnson School 
of Public Affairs at the University of Texas-Austin to develop a metric that includes an observational 
component based on surveys of principals and a value-added component based on student 
performance on the state’s standardized tests. In addition, the state engaged stakeholders 
in designing the metric by convening both IHEs and school practitioners to provide feedback 
and offer ideas. The state also invited statistical experts from various IHEs to serve in a separate 
measurement assessment group. The resultant model is a complex design that accounts for multiple 
confounding variables (G. W. Simpson, personal communication, May 23, 2012). The metric does 
not measure the effectiveness of any individual teacher. Rather, it estimates the aggregate 
achievement of all students taught by the graduates of a teacher preparation program. In the 
VAM analysis, individual teachers provide the link between a teacher preparation program and 
the students taught by its graduates (J. Lopez, personal communication, May 21, 2012). 

Only student achievement results of beginning teachers who teach a Texas Assessment of 
Knowledge and Skills-related class in Grades 4-8 for reading and Grades 4-10 for mathematics
are included in the model. Consequently, information about student achievement in other areas, 
such as science and social studies, is missing from the state’s evaluation system. In addition, 
although the model matches teachers and students based on class rosters, it does not always do 
so accurately. For example, Simpson recalled an instance where student achievement results for 
mathematics and reading were attributed to students’ homeroom teachers, who taught neither 
subject (G. W. Simpson, personal communication, May 23, 2012). Adding another layer of complexity, 
the state’s assessments changed in 2012; consequently, the value-added results for the first two 
years will be based on different tests. 

Recognizing the limitations of VAM in its current state, the Texas Education Agency intends to have 
a two-year pilot of this new measure of teacher preparation programs because of its newness and 
recent changes in the state assessment program (Texas Education Agency, n.d.a). The state has 
assured teacher preparation programs that data from the first few years would not be released 
because VAM is not yet reliable given small sample numbers. However, because the state must 
present this information to SBEC, members of the Texas Association of Colleges for Teacher 
Education worry that the state’s assurances hold little weight. After information is presented
to SBEC, it will become public information and thus be subject to “gross misinterpretation”
(G. W. Simpson, personal communication, May 23, 2012).

Field Supervision. The fourth standard assesses the frequency, the duration, and the quality of field 
supervision. The Texas Administrative Code requires that teachers be observed for 45 minutes at 
least 3 times within their 12-week student-teaching or clinical experiences (19 TAC §228.35(f)). After
each observation, teacher candidates must engage with the observer in a conference and receive 
written feedback. Texas sets yearly compliance percentages for frequency and duration (in the 
2011-12 school year, the standard was 95 percent). Beginning in the 2011-12 school year, SBEC 
measured the quality of field supervision via an exit survey, which will be distributed to teacher 
candidates as they complete their teacher preparation programs (Texas Education Agency, n.d.a). 
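
The frequency and duration requirement lends itself to a simple check. The sketch below assumes a candidate is compliant when at least three observations each lasted 45 minutes or more, and it computes the share of compliant candidates in a program for comparison with the 95 percent standard for 2011-12. How Texas actually computes compliance is not described in the source, so the function names and the data layout are hypothetical.

```python
REQUIRED_OBSERVATIONS = 3    # at least 3 observations (from the text)
REQUIRED_MINUTES = 45        # each lasting 45 minutes (from the text)
COMPLIANCE_STANDARD = 0.95   # 2011-12 standard (from the text)


def candidate_is_compliant(observation_minutes):
    """True if at least three observations lasted 45 minutes or more."""
    qualifying = [m for m in observation_minutes if m >= REQUIRED_MINUTES]
    return len(qualifying) >= REQUIRED_OBSERVATIONS


def program_compliance_rate(candidates):
    """Share of candidates meeting the observation requirement.

    candidates: dict mapping a candidate id to a list of observation
    lengths in minutes (a hypothetical layout for this sketch).
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    compliant = sum(candidate_is_compliant(obs) for obs in candidates.values())
    return compliant / len(candidates)
```

A program would meet the standard under these assumptions when `program_compliance_rate(candidates) >= COMPLIANCE_STANDARD`.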

In addition to the four measures listed, teacher preparation programs must also submit data on 
the number of candidates who apply for admission into the program, the number admitted into 
the program, the number retained throughout the course of the program, the number of program 
completers, the number of candidates employed in the profession after program completion, and the 
number of candidates retained in the profession (TEC §21.045). Although most of these statistics 
are not new to teacher preparation programs, some programs contend that retention in the profession 
is not directly related to the quality of the teacher preparation program attended. Simpson explained, 
“Retention is a reflection of workforce conditions in the field, not necessarily a lack of quality
preparation” (personal communication, May 23, 2012). 

The new evaluation of teacher preparation programs may have its limitations, but it shows promise. 
According to Simpson, the new sources of data, if improved, could provide both the state and 
teacher preparation programs with more information about program quality and areas in need of 
improvement. In addition, the Texas Association of Colleges for Teacher Education is encouraged 
by the growing responsiveness of SBEC. In recent years, SBEC has made the requirements much 
more uniform across traditional and alternative certification programs, allowing for easier comparison
across programs. In addition, it is much more attuned to failing teacher preparation programs and is 
willing to take action to stop the perpetuation of ineffective programs. 

Texas’ new review process of teacher preparation programs combines multiple measures of teacher 
preparation program outcomes. Given the newness of these measures, the state has committed to 
continued development and refinement of surveys and value-added measures, suggesting that the 
development of new evaluation processes is not a one-shot deal. Despite these assurances, however, 
concerns persist regarding the use and the dissemination of decontextualized or weak data to 
evaluate teacher preparation programs. 

Tennessee 

In 2007, the Tennessee General Assembly passed a bill that directed the State Board of Education 
to assess the effectiveness of teacher preparation programs. The law requires that the State Board 
of Education collect and report data on placement and retention rates, Praxis II results, and teacher 
effect data based on Tennessee Value-Added Assessment System scores (Tennessee Higher 
Education Commission, 2011). 

The most recent set of reports, from November 2011, included general and demographic information 
for each IHE in Tennessee, including the school’s accreditation status, the largest endorsement 
areas in approved teacher education programs, and the average academic information of the 
teacher preparation program candidates. Based on data in the state’s Personnel Information 
Reporting System, the state reports the percentage of graduates from teacher preparation 
programs that enter and remain in the teaching field at the 1-, 2-, 3-, and 4-year marks. The 
reports also included the pass rates on teacher assessments and a map of where 2009-10
graduates taught in 2010-11. Finally, the reports included comparisons of the effectiveness of
beginning teachers — teachers with 1 to 3 years of experience — to teachers statewide (Tennessee 
Higher Education Commission, 2011). These reports are currently available online for users to 
view. In addition, Tennessee aims to provide teacher preparation programs with individualized 
feedback identifying the strengths and weaknesses of their programs in fall 2012 (E. Carter, 
personal communication, May 15, 2012). 
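
As a rough illustration of the retention figures described above, the following minimal sketch computes the percentage of a program's graduates still teaching at the 1-, 2-, 3-, and 4-year marks. The function name, the input format, and the sample cohort are hypothetical; they are not taken from Tennessee's Personnel Information Reporting System. The sketch also assumes that every graduate in the cohort has been observed for at least 4 years.

```python
from typing import Dict, List, Sequence


def retention_rates(years_taught: List[int],
                    marks: Sequence[int] = (1, 2, 3, 4)) -> Dict[int, float]:
    """Percentage of a program's graduates still teaching at each year mark.

    years_taught[i] is the number of consecutive years graduate i has taught
    in the state's public schools (0 means the graduate never entered teaching).
    This input format is an assumption made for illustration only.
    """
    total = len(years_taught)
    if total == 0:
        return {m: 0.0 for m in marks}
    return {
        m: 100.0 * sum(1 for y in years_taught if y >= m) / total
        for m in marks
    }


# Hypothetical cohort of 10 graduates from one program.
cohort = [0, 1, 4, 4, 2, 3, 4, 0, 4, 1]
rates = retention_rates(cohort)
```

For this made-up cohort, 8 of 10 graduates reach the 1-year mark and 4 of 10 reach the 4-year mark, so the reported rates fall from 80 percent to 40 percent.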

Despite recent success in gathering information on teacher preparation programs, the Tennessee 
Higher Education Commission reported some limitations in its data. VAM data were available only 
for those teachers who worked in tested subjects and grades — about 35 percent of the 2009-10 
teacher candidates from teacher preparation programs. In November 2012, the commission aims 
to have more VAM scores available as part of the new teacher and principal evaluation system. 
Another limitation of the data is that it includes only those graduates who are teaching in public 
K-12 schools in Tennessee; program completers who work in private schools or in out-of-state 
schools are currently excluded from the analyses (Tennessee Higher Education Commission, 
2011). Tennessee’s recent reports provide the public with increased information about the 
effectiveness of teacher preparation programs, but the availability of data remains limited. 

North Carolina 

The Carolina Institute for Public Policy at the University of North Carolina has released yearly 
reports on the effectiveness of teacher preparation programs using the value-added scores of the 
graduates. These reports are produced and published independently of the state. The most recent 
report drew on 2.6 million test scores of 1.7 million students in the state and data on more than 
28,000 teachers with fewer than 5 years of experience (Henry, Thompson, Fortner, Bastian, & 
Marcus, 2011b). 

North Carolina used two models to calculate teacher effects. Using a year-to-year multilevel model, 
researchers generated an estimate of effect for each of the state’s public IHEs by comparing the 
graduates of each IHE to the aggregate of all teachers in the state. Researchers then used each 
teacher preparation program as a reference group to calculate estimates of differences between 
each program and 12 other categories of teacher preparation (e.g., University of North Carolina 
graduate degree prepared, North Carolina private university undergraduate prepared, and TFA). 
When calculating the value-added scores, researchers included multiple covariates, including 
teacher, student, school, and classroom characteristics. These covariates served as controls 
that allowed the researchers to generate comparable estimates of teacher effects on student 
achievement across schools and teacher preparation programs (Henry et al., 2011b). 
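
The general idea of comparing each program's graduates to the statewide aggregate after controlling for covariates can be sketched as follows. This is a deliberately simplified stand-in, not the multilevel model that Henry et al. used: it adjusts for a single covariate (each student's prior score) with ordinary least squares and then averages the residuals by program. The function name, the record layout, and the sample data are hypothetical, and the sketch assumes that prior scores are not all identical.

```python
from collections import defaultdict
from statistics import mean


def program_effects(records):
    """Covariate-adjusted difference between each program and the aggregate.

    records: list of (program, prior_score, current_score), one per student.
    Step 1: fit current = a + b * prior by ordinary least squares.
    Step 2: average the residuals of each program's students. Residuals
    average to zero over all students, so each program's mean residual
    reads as a difference from the aggregate of all teachers.
    """
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    p_bar, c_bar = mean(priors), mean(currents)
    sxx = sum((p - p_bar) ** 2 for p in priors)
    sxy = sum((p - p_bar) * (c - c_bar) for p, c in zip(priors, currents))
    b = sxy / sxx          # slope on the prior score
    a = c_bar - b * p_bar  # intercept
    residuals = defaultdict(list)
    for program, prior, current in records:
        residuals[program].append(current - (a + b * prior))
    return {program: mean(vals) for program, vals in residuals.items()}


# Hypothetical data: students taught by graduates of programs "A" and "B".
sample = [("A", 10, 13), ("A", 20, 23), ("B", 10, 11), ("B", 20, 21)]
effects = program_effects(sample)
```

In this made-up sample, students of program A's graduates score 1 point above what their prior scores predict and students of program B's graduates score 1 point below. A real analysis would add the teacher, student, school, and classroom controls described above and account for the nesting of students within teachers and schools.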

The reports by the Carolina Institute for Public Policy attempt to provide more information about the 
effectiveness of teacher preparation programs (see Henry et al., 2011a), but challenges remain. 
Data gathering is limited to public universities in North Carolina only; thus, the reports capture 
the effectiveness of only 15 of the state’s 48 teacher preparation programs (U.S. Department 
of Education, n.d.). In addition, 23 percent of all teachers in North Carolina were graduates from 
out-of-state teacher preparation programs in the 2009-10 school year (Carolina Institute for 
Public Policy, 2012). Therefore, the institute’s current research provides limited information on 
the effectiveness of teacher preparation programs in North Carolina. 

All the reports are publicly available but are difficult both to locate and to read. 16 
North Carolina intends to create educator preparation program report cards that will align with the 
current K-12 school report cards and be accessible to the public (U.S. Department of Education, 
2012c). Information for the educator preparation program report cards will be gathered from 
the annual North Carolina IHE Performance Report and will be hosted on the North Carolina 
Department of Public Instruction website. 

Ohio 

In 2009, the 128th General Assembly of Ohio passed H.B. 1, which directed the Ohio Board of 
Regents to establish a system for evaluating the state’s teacher preparation programs (University 
System of Ohio Board of Regents, n.d.). In response to the legislation, the Ohio Board of Regents 
worked with stakeholders, including deans from teacher preparation programs, to arrive at a 
consensus around key aspects of the evaluation system for teacher preparation programs. It has 
also partnered with Ohio State University to develop surveys and worked with Jim Cibulka, the 
CAEP president, to conduct a psychometric analysis of data to determine validity and reliability 
(Bordenkircher, 2012). 

Ohio is developing indicators with 14 variables to be included in the state’s higher education 
report card. The report cards will include data on teacher success on various measurements, the 
commitment to excellence and innovation by teacher preparation programs, and other measures 
of teacher preparation quality. By the end of 2012, the state aims to have data on 6 variables: 

1. TPA results provided by Pearson 

2. Praxis II examination results 

3. Results of additional examinations with cut scores, such as foreign language exams 

4. Employer satisfaction surveys 

5. Teacher exit surveys 

6. Feedback from educator residencies from the first year of pilots 

Ohio is still finalizing the details regarding what its higher education report card will look like, but 
each IHE will receive a score based on a formula created by the Ohio Department of Education. 
No cut score exists, but the public will have access to the data and be able to see where each 
IHE ranks (Bordenkircher, 2012). 

To facilitate data collection for teacher preparation programs, the Board of Regents houses a 
new portal on its website to store candidate and employer surveys. Teachers and employers will 
complete the surveys on the website. Website programming will tally the results, and the state will 
communicate the results to teacher candidates, preparation program providers, and employers 
(Bordenkircher, 2012). 


16 The reports are located on two subpages of the Department of Public Policy’s Web page on the University of North Carolina 
(UNC) at Chapel Hill website: http://publicpolicy.unc.edu/research and 
http://publicpolicy.unc.edu/research/publications-presentations-and-reports/?searchterm=teacher%20portals. 
To access these reports, users need to know that they exist and navigate the UNC website until they find them. 

Ohio’s recent changes reflect multiple efforts by the state and the Ohio Teacher Quality Partnership 
to increase their understanding of various measures. Prior to the passage of H.B. 1, all of Ohio’s 
51 IHEs offering teacher preparation programs teamed together to create the Ohio Teacher Quality 
Partnership. Supported by the Ohio Board of Regents, the Ohio Department of Education, and 
private corporations, the Ohio Teacher Quality Partnership conducted 5 studies aimed at providing 
IHEs and policymakers with greater understanding of the effectiveness of teacher preparation 
programs. Specifically, the Ohio Teacher Quality Partnership aimed to understand the following 
(Lasley, Siedentop, & Yinger, 2006): 

■ The link between teacher preparation experiences and student achievement 

■ Which teachers have the greatest impact on students 

■ The relationship between institutional practice and student achievement 

■ The impact of teachers who were licensed through alternative and traditional pathways on 
student achievement 

According to Martha Hendricks (personal communication, February 14, 2012) from Wilmington 
College, the research project struggled to obtain teacher-level VAM data to conduct the studies. 
The Ohio Department of Education could not provide teacher-level VAM data to the Ohio Teacher 
Quality Partnership because of inaccuracies in the reporting system. At that time, Ohio calculated 
VAM data at the grade level, not the teacher level. When the Ohio Teacher Quality Partnership 
compared the requirements and the offerings of teacher preparation programs statewide, it 
struggled to find significant differences among the programs. One possible explanation for the 
lack of discrimination in the data set could be that the national and state regulations for teacher 
preparation programs look very similar on paper (Franco & Hendricks, 2012). 

Now, the Ohio Department of Education will provide aggregate student growth and achievement 
value-added data results linked to each teacher education preparation program. Ohio’s most recent 
efforts demonstrate that evaluating teacher preparation programs requires collecting and sharing 
data from multiple sources. Ohio will collect data from teacher preparation programs, employers, 
teacher candidates, the Ohio Department of Education, and test providers. The state anticipates 
that CAEP, the state’s accreditation body, teacher preparation programs, and individuals will benefit 
from the additional and centrally located data that the report cards will provide. Users will be able 
to access the data and make informed decisions based on a greater breadth of information 
(Bordenkircher, 2012). 

Florida 

Florida is in the preliminary phases of revising the evaluation of its teacher preparation programs. 
As part of its Race to the Top work, the Florida Department of Education (FLDOE) formed a 
committee (the Teacher and Leader Preparation Implementation Committee [TLPIC]) consisting 
of statewide stakeholders to provide input on how to improve its teacher preparation programs. 
This committee meets regularly in person or via webinars or conference calls to collaborate on 
all recommendations and decisions. The committee has 24 members that include K-12 teachers, 
school leaders, district administrators and superintendents, and IHE faculty. The TLPIC meetings 
are open to the public and are webcast on the FLDOE website. 

The committee’s focus since November 2011 has been on selecting and setting performance 
measures to evaluate the quality of its teacher preparation programs. During recent meetings, 
FLDOE presented TLPIC with requested data and provided explanations of the data to help the 
committee evaluate the potential uses of each data element toward setting potential performance 
targets. FLDOE also tasks committee members with homework, such as individually submitting 
their recommendations for weighting data elements (FLDOE, 2012). Submitted information is then 
compiled and presented back to TLPIC for further discussion and decision making. In May 2012, the 
committee reconvened and made preliminary recommendations for performance targets for three 
levels of continued approval and one for denial. These recommended performance targets for the 
new teacher preparation program accountability system will be finalized during its fall 2012 
meeting in central Florida (J. Orange, personal communication, May 22, 2012). 

According to Julie Orange (personal communication, February 22, 2012), having a variety of 
stakeholders on TLPIC has been useful. Stakeholders come from a variety of backgrounds and 
have varied experience levels and expertise to contribute to committee work. During the May meeting, 
committee members recommended that a subset of IHE representatives on the committee form a 
subcommittee to discuss potential revised continued approval site visit procedures because they 
have firsthand experience working with the current process. This subcommittee will then present its 
recommendations to the full committee (TLPIC) for review and collective decision making (J. Orange, 
personal communication, May 22, 2012). 

Although TLPIC has made progress, it has faced challenges. One challenge has been in organizing 
meetings. The TLPIC members are professionals who work full-time elsewhere, so coordinating 
schedules has been difficult. Although all the members agreed to serve for 4 years, some attrition 
has occurred due to retirement and unique circumstances. FLDOE replaced these members with 
new members with similar expertise to maintain the balance of the committee. 

TLPIC is also faced with the challenge of navigating new territory. A lack of model programs in 
other states and limited data sources present barriers that the committee must address as it 
moves forward with its recommendations. The committee has met via Skype with Dr. George Noell, 
a professor at Louisiana State University and the executive director for strategic research and 
analysis at the Louisiana Department of Education, to discuss Louisiana’s teacher preparation 
accountability system (FLDOE, 2011), but it has struggled to identify other states that have already 
piloted teacher preparation program accountability outcome measures (J. Orange, personal 
communication, February 22, 2012). In addition, TLPIC has expressed interest in the results of 
Florida’s new teacher evaluation system and the data it will provide regarding the performance 
evaluation of recent completers of teacher preparation programs. Because these data are not yet 
available for the 2011-12 school year, TLPIC’s recommendations will be delayed until fall 2012. 

In the meantime, committee members have begun their next task of reviewing the Uniform Core 
Curriculum and field experiences and admission requirements for teacher preparation programs 
and making recommended changes based on desired performance outcomes (J. Orange, personal 
communication, May 22, 2012). 17 

By engaging multiple groups of stakeholders, Florida has facilitated productive conversations that 
draw on the expertise of multiple groups. Florida’s recent efforts suggest that the development 
of a new evaluation of teacher preparation programs requires a thoughtful and thorough review 
of available data and measures, including their strengths and their limitations. 


17 The work of TLPIC and videos of recent meetings can be accessed at http://www.fldoe.org/committees/tlp.asp. 




Research & Policy Brief 


CONCLUSION AND RECOMMENDATIONS 


As discussed in this brief, multiple measures exist for assessing the quality of teacher preparation 
programs. Certain process measures, such as syllabi reviews and surveys of teacher candidates, 
capture the quality of program content and structures. Other measures focus on outcomes, such as 
a graduate’s effects on student achievement and his or her effectiveness in the classroom. These 
measures are newer and largely untested, but despite their limitations, they hold promise for 
providing us with a greater understanding of the quality of teacher preparation programs. 

For many educators, an outcome-based approach to teacher preparation accountability marks 
a paradigm shift. As states and other organizations revise their teacher preparation program 
accountability systems, stakeholder engagement is paramount. States such as Florida and Texas 
have already harnessed the expertise that multiple stakeholders, ranging from assessment experts 
to school leaders and educators and representatives from IHEs, can contribute. Furthermore, such 
engagement contributes to the constructive change needed to meet this paradigm shift. Stakeholder 
involvement in designing and implementing the evaluation systems also ensures that those involved have 
an in-depth understanding of the processes, options, and challenges in the system that a state 
ultimately decides to use. 

Challenges remain regarding the design of accountability systems for teacher preparation programs. 
Additional research and capacity building are needed to bridge the divide between our current data 
and evaluation capacity and what is needed for accountability, continuous program improvement, and 
equity. As demonstrated in the state examples described in this brief, states and other organizations 
require additional ways of communicating and sharing data across organizations and states. Some 
states, such as Ohio, have broken down data silos to facilitate the exchange of data, but additional 
progress is needed. Gathering data on out-of-state preparation programs and collecting effectiveness 
data from teachers working in private or out-of-state schools remain challenging. Furthermore, 
assessing the contribution of teachers to student learning growth in subjects and grades for which 
there are no standardized tests is a major challenge. 

The revision of evaluation systems does not end with selecting and developing measures. As 
states begin to implement new evaluation methods, the field must strategically evaluate these 
methods to determine their validity, reliability, and best use. Finding the best combination of 
measures to fit each state’s context and needs will require constant monitoring and an evaluation 
of those measures. In addition, accrediting agencies, states, teacher preparation programs, and 
school districts will need to augment their data collection, management, and analysis capacity to 
maximize the utility of the data for accountability, improvement, and equity purposes. 

In the meantime, states and other organizations, in collaboration with stakeholder groups, should 
consider the strengths and the weaknesses of the available measures and select those that will 
best fit the context of the evaluation. Although each measure has inherent weaknesses, thoughtfully 
designed and carefully implemented combinations of measures can provide a more comprehensive 
and accurate picture of teacher preparation program quality than prevailing methods of evaluation 
currently do. Ensuring program effectiveness and equity requires that we do better. 

APPENDIX. PROCESS AND OUTCOMES MEASURES: 
STRENGTHS AND WEAKNESSES 


Measure: Selectivity of teacher preparation programs 

Strengths: 
• Supported by research that suggests that the selectivity of teacher preparation programs may correlate with their effectiveness (Decker et al., 2005; Henry et al., 2012; Kane et al., 2008; Rice, 2003) 

Weaknesses: 
• Only accounts for a relatively small amount of variation in student achievement (Kane et al., 2008) 
• May not account for less easily observable or measurable aspects of candidate aptitude that have a greater effect on teaching effectiveness (Farr, 2010; Rimm-Kaufmann et al., 2002) 


Measure: Syllabi as evidence of course content 

Examples: 
• NCTQ’s evaluations of teacher preparation programs (see Greenberg & Walsh, 2012; Walsh et al., 2006) 
• The Colorado Department of Education’s (2010) use of content reviews as part of the program approval process 

Strengths: 
• Strong indicator of content and requirements in teacher preparation coursework 
• Can help identify potential redundancies or gaps in coursework 

Weaknesses: 
• May not perfectly reflect all that is included in the class 
• Criteria to assess syllabi can be difficult given disagreements regarding the essential learning of teacher preparation coursework 


Measure: Surveys of the graduates of teacher preparation programs regarding clinical experiences 

Examples: 
• Texas surveyed educator preparation program candidates to determine whether teacher preparation programs met minimum standards for student teaching (Texas Education Agency, n.d.b) 

Strengths: 
• Cost little to distribute 
• Gain feedback directly from the graduates of teacher preparation programs 
• Provide greater detail about the quality of student teaching than the number of hours devoted to clinical experiences 

Weaknesses: 
• May be subject to bias 
• Rely heavily on estimations and perceptions rather than actual facts 
• Disseminating the survey in a timely manner and collecting sufficient response rates may be challenging 


Measure: Document reviews 

Examples: 
• NCTQ’s study of student teaching (Greenberg et al., 2011) 

Strengths: 
• Draw on preexisting data 
• Provide information about the structure and the format of and the requirements and the expectations for student teaching 
• Provide greater detail about the quality of student teaching than the number of hours devoted to clinical experiences 

Weaknesses: 
• May capture intention rather than practice 
• Require development of criteria and time-consuming document review 


Measure: Value-added models 

Examples: 
• Louisiana’s value-added model (Noell et al., 2009) 

Strengths: 
• Provide a common metric to compare programs 
• Quantify differences between programs (Goldhaber & Liddle, 2012; Noell et al., 2009) 
• Can be used as a trigger for further action 

Weaknesses: 
• Provide little guidance in terms of how to actually improve programs 
• Are imperfect (e.g., attribution error and bias) 
• Can be used only for a small number of graduates of teacher preparation programs 
• Difficult to calculate given that few states have fully tested and functional data systems needed for collecting and analyzing data for VAM (Data Quality Campaign, 2012) 
• Can be problematic given the nonrandom assignment of graduates of teacher preparation programs 
• Do not fully capture desired student learning outcomes (Henry et al., 2011a) 
• Do not capture significant variation in teacher training program effects in recent research (Goldhaber & Liddle, 2012; Koedel et al., 2012; Mihaly et al., 2012) 


Measure: State/district teacher evaluation results 

Strengths: 
• Provide a more comprehensive picture of program effectiveness 
• Are based on multiple measures 
• Are or will soon be available for all public school teachers 
• Can help teacher preparation programs pinpoint strengths and weaknesses in their graduates’ practices 

Weaknesses: 
• Lack comparability across school districts and states 
• Lack proven evidence of their validity and reliability 


Measure: Principal/employer surveys of teacher effectiveness 

Examples: 
• Texas’ Teacher Preparation Effectiveness Survey 

Strengths: 
• Supported by research, which shows high correlation between principal assessment and teachers’ value-added scores (Sartain et al., 2011; Tyler et al., 2010) 
• May help principals pay closer attention to how new hires are prepared 

Weaknesses: 
• Subject to bias 
• May rely too heavily on perception and not accurately measure practice 


Measure: Surveys of the graduates of teacher preparation programs 

Examples: 
• California State University System’s Exit Survey 

Strengths: 
• Are inexpensive to distribute 
• Gather feedback from those most impacted by teacher preparation programs (the teacher candidates themselves) 

Weaknesses: 
• Rely on timely administration and high response rates 
• Rarely are common instruments used across preparation programs 
• Are subject to bias and reliant on perceptions 


Measure: Hiring and placement data 

Strengths: 
• Can help stakeholders understand if teacher preparation programs are preparing graduates to assume positions in high-need schools 
• Provide information about labor market needs 
• Are already collected data 

Weaknesses: 
• Assumes that teacher preparation programs have control over hiring and placement, but other economic and personal decisions may factor into employment decisions 


Measure: Retention/persistence in teaching data 

Strengths: 
• Can indicate a problem if a teacher preparation program has a disproportionately high percentage of graduates that leave teaching or do not have contracts renewed 

Weaknesses: 
• Overlook other potential reasons why teachers leave, such as poor school fit or lack of support (Liu & Johnson, 2006) 


Measure: Licensure exams 

Examples: 
• Results of Praxis II exams (see U.S. Department of Education, 2011c) 

Strengths: 
• Are required of teacher candidates by nearly all states (Crowe, 2010; U.S. Department of Education, 2011c) 

Weaknesses: 
• Are not strongly correlated with student achievement (Clotfelter et al., 2007; Goldhaber, 2007) 
• Do not accurately predict whether teachers will be effective with groups of students (Goldhaber & Hanson, 2010) 


Measure: Performance assessments 

Examples: 
• TPA 

Strengths: 
• Measure teachers’ practices rather than proxies 
• Are more authentic measures than other alternatives 

Weaknesses: 
• Require significant effort from the candidate 
• Are unwieldy and time consuming to evaluate 
• May be subjective if not properly implemented 